PORT PROFILING



(57) Abstract: A port profiling system detects unauthorized network usage. The port profiling system analyzes network
communications to determine the service ports being used. The system collects flow data from packet headers between two hosts or Internet
Protocol (IP) addresses. The collected flow data is analyzed to determine the associated network service provided. A host data structure is
maintained containing a profile of the network services normally associated with the host. If the observed network service is not
one of the normal network services performed as defined by the port profile for the host, an alarm signal is generated and action can
be taken based upon the detection of an Out of Profile network service. An Out of Profile operation can indicate the operation of a
Trojan Horse program on the host, or the existence of a non-approved network application that has been installed.

NETWORK PORT PROFILING

Cross Reference to Related Applications
This patent application claims priority to the United States provisional patent
application serial no. 60/265,194 entitled "The Use of Flows to Analyze Network Traffic"
filed on January 31, 2001, which is incorporated in its entirety by reference and made a
part hereof. This application is related to the United States non-provisional patent
application serial no. 10/000,396 and PCT patent application PCT/US01/45,275, both
entitled "Flow-Based Detection of Network Intrusions" and filed 30 November 2001,
both of which are hereby incorporated by reference in their entirety and made part hereof.


Reference to Computer Program Listing Submitted on CD
This application incorporates by reference the computer program listing appendix
submitted on one (1) CD-ROM entitled "Flow-Based Engine Computer Program Listing" in
accordance with 37 C.F.R. § 1.52(e). Pursuant to 37 C.F.R. § 1.77(b)(4), the material on
said CD-ROM is incorporated by reference herein, said material being identified as
follows:

Size in Bytes    Date of Creation    File Name

154,450          January 31, 2002    LANcope Code.txt

A portion of the disclosure of this patent document including said computer code
contains material that is subject to copyright protection. The copyright owner has no
objection to the facsimile reproduction by anyone of the patent document or the patent
disclosure, as it appears in the Patent and Trademark Office patent file or records, but
otherwise reserves all copyright rights whatsoever.

Technical Field 

The invention relates generally to the field of network monitoring and, more
particularly, to a detection system that monitors network activity by comparing network
activity with a prestored profile and identifies suspicious port activity that may indicate
unauthorized network activity.

Background Art 

Networks have become indispensable for conducting all forms of business and
personal communications. Networked systems allow one to access needed information
rapidly, collaborate with partners, and conduct electronic commerce. The benefits offered
by Internet technologies are enormous. While computer networks revolutionize the way
one does business, risks are introduced. Unauthorized network usage can lead to network
congestion or even system failures. Furthermore, attacks on networks can lead to lost
money, time, reputation, and confidential information. Effective network monitoring can
mitigate these system problems.

High network availability is critical for many enterprises. Many performance
problems are related to capacity issues. Unauthorized network usage can slow down the
performance of mission critical applications and monopolize available bandwidth. Some
unauthorized applications, like a Trojan Horse, can erase or degrade essential data as well
as possibly provide access to vital confidential information.

Consequently, one primary danger to avoid is having outside intruders gain control
of a host on a network. Once control is achieved, private company files can be
downloaded, the controlled host can be used to attack other computers inside the firewall,
or the controlled host can scan or attack computers anywhere in the world. Many
organizations have pursued protection by the implementation of firewalls and intrusion
detection systems (IDS). However, no avoidance measures are fail safe. Therefore,
monitoring for the presence of unauthorized applications and unauthorized activity is
important.

Firewalls merely limit access between networks. Firewalls are typically designed
to filter network traffic based on attributes such as source or destination addresses, port
numbers, or transport layer protocols. Firewalls are susceptible to maliciously crafted
traffic designed to bypass the blocking rules established.

Almost all commercially available IDS are signature-based detection systems or
anomaly-based systems. Signature-based detection systems piece together the packets in a
connection to collect a stream of bytes being transmitted. The stream is then analyzed for
certain strings of characters in the data commonly referred to as "signatures." These
signatures are particular strings that have been discovered in known exploits. The more
signatures that are stored in a database, the longer it takes to do an exhaustive search on
each data stream. For larger networks with massive amounts of data transferred, a string
comparison approach is unfeasible. Substantial computing resources are needed to
analyze all of the communication traffic.

Even if a known exploit signature has been discovered, the signature is not useful
until it has been installed and is available to the network. In addition, signature analysis
only protects a system from known attacks. Yet, new attacks are being implemented all
the time. Unfortunately, a signature-based detection system would not detect these new
attacks and therefore leaves the network vulnerable.

Another approach to intrusion detection includes detection of unusual deviation
from normal data traffic commonly referred to as "anomalies." Like signature-based
detection systems, many current anomaly-based intrusion detection systems only detect
known methods of attack. Some of these known anomaly-based attacks include TCP/IP
stack fingerprinting, half-open attacks, and port scanning. However, systems relying on
known attacks are easy to circumvent and leave the system vulnerable. In addition,
some abnormal network traffic happens routinely, often non-maliciously, in normal
network traffic. For example, an incorrectly entered address could be sent to an
unauthorized port and be interpreted as an abnormality. Consequently, known
anomaly-based systems tend to generate an undesirable number of false alarms, which creates a
tendency for all alarms to be ignored.

Some known intrusion detection systems have tried to detect statistical anomalies.
This approach involves measuring a baseline and then triggering an alarm when deviation
is detected. For example, if a system typically has no traffic from individual workstations
at 2 AM, activity during this time frame would be considered suspicious. However,
baseline systems have typically been ineffective because the small amount of malicious
activity is masked by the large amounts of highly variable normal activity. In the
aggregate, it is extremely difficult to detect the potential attacks.

Other intrusion detection systems compare long term profiled data streams to short
term profiled data streams. One such system is described in U.S. Patent No. 6,321,338 to
Porras et al., entitled "Network Surveillance." The system described in this patent does
not necessarily analyze all the network traffic, but instead focuses on narrow data streams.
The system filters data packets into various data streams and compares short term profiles
to profiles collected over a long period. However, data traffic is typically too varied to
meaningfully compare short term profiles to long term profiles. For example, merely
because the average File Transfer Protocol (FTP) stream may be 3 megabytes over the
long term does not indicate that a 20 megabyte stream is an anomaly. Consequently, these
systems generate a significant number of false alarms, or the malicious activity can be
masked by not analyzing the proper data streams.

Failure to detect the operation of a malicious unauthorized application, such as a
Trojan Horse, can cause serious harm to a company. A Trojan Horse is a program in
which harmful code is contained inside an apparently harmless program or data in such a
way that it can gain control of the computer or otherwise do its designed form of damage.

However, other unauthorized network usage can also be harmful. Employees may
waste time and resources by installing and playing games over the network. An
unauthorized web site may utilize crucial bandwidth by providing materials such as pictures,
streaming audio, or movies. Even a chat program can waste time and network assets.
Valuable resources can also be monopolized by these types of unauthorized network
activities.

Consequently, a monitoring system is needed that can detect the operation of new
unauthorized network services. The system needs to be able to differentiate between
legitimate network usage and unauthorized activity. Furthermore, the detection system
must be able to function even with the data traffic of larger networks. The system needs
to alarm upon detection of the operation of a new network service in use on any monitored
host computer.

Disclosure of the Invention
The present invention provides a more accurate and reliable method for detecting
unauthorized network usage based upon port profiling. This novel detection system does
not require a known signature database of known attacks. Instead, the monitoring system
inspects all inbound and outbound activity and identifies new services that are not listed
on that host's service profile. The computational simplicity of the technique allows for
operation at much higher speeds than is possible with other detection systems on
comparable hardware.

According to one aspect of the invention, the detection system works by assigning
data packets to various client/server flows. Statistics are collected for each determined
flow. Then, the flow statistics are analyzed to determine if the flow appears to be
legitimate traffic or a probe. Each legitimate flow is associated with a service port. A
comparison is made between the detected service and the service profile of that host. If
the service is not listed in the service profile for that host, an alert is issued and appropriate
action can be taken.

Generally speaking, the port profiling system analyzes network communication
traffic for unauthorized network activity. The system collects flow data from packet
headers between two hosts or Internet Protocol (IP) addresses. A determination is made
whether a monitored flow is a valid connection with data flow. The collected flow data is
analyzed to determine the associated service provided and which hosts on the network
act as a client and server for each valid connection or data transferred.

A host data structure is maintained containing a profile of the services normally
associated with the host. This stored information becomes the allowed network services
profile. The network services being used by every host in a predefined group of hosts are
determined. The system compares the allowed network services with observed network
services. If the observed network service is not one of the allowed network services
performed by that host as defined by the port profile for that host, an alarm signal is
generated.

An embodiment of the system is capable of displaying indicia indicating observed
network services during a monitoring period. Additionally, the system can display an
indication of whether the observed network services have been previously seen during the
present time period. Furthermore, indicia indicating whether an observed network
service is not an allowed network service can be displayed.

The network service profile can be built automatically based upon network
services observed during a profile generation time period. Furthermore, the network
service profile can be edited to modify the allowed network services in the allowed
network services profile of a host or to modify the allowed network services profile for a
block of network addresses.

Brief Description of the Drawings
Benefits and further features of the present invention will be apparent from a
detailed description of a preferred embodiment thereof taken in conjunction with the
following drawings, wherein like elements are referred to with like reference numbers, and
wherein:

FIG. 1 is a functional block diagram illustrating network data collection for a port
profiling system constructed in accordance with a preferred embodiment of the present
invention.

FIG. 2 is a functional block diagram illustrating the operation of the port profiling
system.

FIG. 3 is a diagram illustrating headers of datagrams.

FIG. 4 is a functional block diagram illustrating an exemplary normal TCP
communication.

FIG. 5 is a functional block diagram illustrating the operation of network services.

FIG. 6 is a functional block diagram illustrating a port profiling engine.

FIG. 7 is a screen shot illustrating the port profiling display.

FIG. 8 is a functional block diagram illustrating hardware architecture.

FIG. 9, consisting of FIGS. 9A through 9C, are flow charts of the program threads
in an exemplary embodiment of the invention.



Best Mode

The described embodiment discloses a system that provides an efficient, reliable
and scalable method of monitoring unauthorized services by a host computer.
Unauthorized network services are detected by a port profiling engine that monitors activity
to differentiate between abnormal activity and normal communications. Port profiling
does not rely on analyzing the data of packets for signatures of known attacks. Analyzing
character strings for known attack signatures is extremely resource intensive and does not
protect against new unknown attacks. Instead, the present system analyzes
communication flows to determine if any service used is a service normally performed by
that host. Those skilled in the art will readily appreciate that numerous communications in
addition to those explicitly described may indicate unauthorized network activity. By
analyzing communications for abnormal port activity, unauthorized network usage can be
determined without the need for resource intensive packet data analysis.

Having a port profile available for a computer network, and the ability to build it up
automatically, edit, display, monitor, display changes, and alarm on changes is extremely
valuable to a network administrator who wants to know what is transpiring over the
network. Service port monitoring is one of the best ways to detect the start up of Trojan
Horse programs that communicate over the network, as well as undesired applications that
may be installed by users.

However, it is useful to discuss the basics of Internet communications to gain an
understanding of the operation of the port profiling engine. Consequently, an
overview of a network data flow will be discussed first. Following the overview is a detailed
description of the operation of the port profiling system. Next, discussions on various
aspects of Internet communications will follow. The functionality of the port
profiling engine of the present invention is described in further detail in reference to
FIGS. 5 through 9.

Port Profiling Data Collection

Turning to the figures, in which like numerals indicate like elements
throughout the several figures, FIG. 1 provides an overview of data collection
for a port profiling engine 155 in accordance with an exemplary embodiment
of the present invention. The port profiling engine 155 monitors network
computer communications. The network computer communications are
routed via a known global computer network commonly known as the
Internet 199. In accordance with an aspect of the invention, the port
profiling engine 155 is incorporated into a monitoring appliance 150, together
with a database 160 that stores information utilized in the port profiling
methodology.

The operating environment of the port profiling engine 155 is contemplated to
have numerous hosts connected by the Internet 199, e.g. Host #1, Host #2, Host #3 (also
referred to as H1-H3 respectively). Hosts are any computers that have full two-way access
to other computers on the Internet 199 and have their own unique Internet Protocol (IP)
address. For example, Host #1 has an exemplary IP address of 208.60.239.19. The
Internet 199 connects clients 110 with a host server 130 in a known client/server
relationship.

In a typical configuration, some computers are referred to as "servers", while
others are referred to as "clients." A server computer such as Host #2 130 typically
provides responses to requests from client computers and provides services, data,
resources, and the like. In contrast, a client computer such as Host #1 110 typically
requests and utilizes the services, data, resources, and the like provided by the server.

It is known in the art to send communications between hosts via the Internet 199.
The Internet Protocol (IP) is the method by which data is sent from one host computer to
another on the Internet 199. Each host on the Internet 199 has an IP address that uniquely
identifies it from all other computers. Illustrated is a user/client 110, host #1 (H1), with an
exemplary IP address of 208.60.239.19 and a server, host #2 (H2), with an IP address of
128.0.0.1.

When data is transmitted, the message gets divided into packets 101. Packets 101
are discussed in more detail in reference to FIG. 3. Each IP packet 101 includes a header
that contains both the sender's Internet address and receiver's Internet address. The
packets 101 are forwarded to the computer whose address is specified. As shown, a client
110 communicates with a server 130 by sending packets 101 of data. A packet 101 is a
unit of data that is routed between an origin and destination. As illustrated, messages are
segmented into numerous packets 101 and routed via the Internet 199 to the receiving
host. The receiving host reassembles the stream of packets 101 to recreate the original
message, which is then handled by application programs running on the receiving
computer system.

However, some of the hosts may be intruders 120, commonly referred to as
hackers or crackers. Intruders 120 exploit vulnerable computers. As shown in FIG. 1, the
intruder 120 is a host with its own exemplary IP address of 110.5.47.224. The intruder
120 also communicates by sending packets 101 via the Internet 199. As previously stated,
the packets 101 contain the IP address of the originator and destination to ensure proper
routing. As shown, the stream of packets 101 sent by the intruder 120 can be interleaved
with the packets 101 sent by other hosts. The packets 101 contain header information that
enables the receiving host to reassemble the interleaved stream of packets into the original
messages as sent.
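
By way of illustration only, the reassembly of interleaved packets can be sketched in Python as follows. The packet fields (`src`, `dst`, `seq`, `payload`) and the function name `reassemble` are assumptions made for this sketch and do not appear in the disclosure; actual reassembly relies on the transport layer header fields discussed in reference to FIG. 3.

```python
from collections import defaultdict


def reassemble(packets):
    """Group interleaved packets by (source, destination) address pair and
    join each group's payloads in sequence-number order.

    Each packet is a dict with hypothetical keys: src, dst, seq, payload.
    """
    streams = defaultdict(list)
    for pkt in packets:
        streams[(pkt["src"], pkt["dst"])].append(pkt)
    return {
        pair: b"".join(p["payload"] for p in sorted(pkts, key=lambda p: p["seq"]))
        for pair, pkts in streams.items()
    }


# Packets from the client and the intruder arrive interleaved at the server.
arrived = [
    {"src": "208.60.239.19", "dst": "128.0.0.1", "seq": 1, "payload": b"GET "},
    {"src": "110.5.47.224", "dst": "128.0.0.1", "seq": 1, "payload": b"pro"},
    {"src": "208.60.239.19", "dst": "128.0.0.1", "seq": 3, "payload": b"html"},
    {"src": "208.60.239.19", "dst": "128.0.0.1", "seq": 2, "payload": b"/index."},
    {"src": "110.5.47.224", "dst": "128.0.0.1", "seq": 2, "payload": b"be"},
]
messages = reassemble(arrived)
```

The source and destination addresses separate the two senders, and the sequence numbers restore the order within each sender's stream.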

Normal client/server communication activity includes sending e-mails, Web
traffic, file transfers, and the like. Communications via the Internet 199 need to be sent to
a specific IP address and to a specific service contact port. A "port" is known to those
skilled in the art as an arbitrarily assigned number to which a particular type of computing
service is assigned in conventional Internet computer-to-computer communications, e.g.
web traffic is conventionally on port 80, FTP traffic on ports 20 and 21, etc. The IP
address specifies a specific host while the service contact port number identifies a
particular server program or service that the host computer may provide. Present day port
numbers for Internet Protocol version 4 (IPv4) range from 0 to 65,535. Internet Protocol
next generation (IPng) or Internet Protocol version 6 (IPv6) is designed to allow for the
expansion of the Internet including expanded routing and address capabilities. The header
will still include the source address and destination addresses as well as a next header in
which the host's service port can be defined.

As shown in FIG. 1, a number of frequently-used services or processes have
conventionally assigned service contact port numbers and are referred to as well-known
port numbers maintained by the Internet Assigned Numbers Authority (IANA). These
assigned port numbers are well known in the art and are typically the low numbered ports
between 0 and 1023. Currently, certain higher numbered ports have also been registered.

A service port chart in FIG. 1 lists some common services that present day
Internet-based computer systems may provide. Outgoing email typically utilizes the
known Simple Mail Transfer Protocol (SMTP) which is implemented over the service
contact port 25. For the Hypertext Transfer Protocol (HTTP) communications, Web
browsers open an ephemeral high port number to initiate Web traffic that is sent to the
host server port 80. File Transfer Protocol (FTP) control communications are sent to the
server port 21, while FTP data transfer originates from port 20. The FINGER service
utilizes service port 79, the domain name service (DNS) utilizes service port 53, and
Telnet communications utilize service contact port 23. As illustrated, common services
are typically associated with specific predetermined service contact ports.
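
For illustration, the service contact ports named above can be captured in a simple lookup table. The following Python sketch is not taken from the computer program listing; the names `WELL_KNOWN_SERVICES` and `describe_port` are assumptions, and the table contains only the ports mentioned in this description.

```python
# Service contact ports named in the text; the table is illustrative, not exhaustive.
WELL_KNOWN_SERVICES = {
    20: "FTP data",
    21: "FTP control",
    23: "Telnet",
    25: "SMTP",
    53: "DNS",
    79: "FINGER",
    80: "HTTP",
}


def describe_port(port: int) -> str:
    """Return the service name for a port, or a generic label if it is unlisted."""
    if not 0 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    if port in WELL_KNOWN_SERVICES:
        return WELL_KNOWN_SERVICES[port]
    # Ports 0 through 1023 are the conventionally assigned well-known range.
    return "well-known (unlisted)" if port <= 1023 else "registered or ephemeral"
```

A Web browser's ephemeral source port would therefore fall in the last category, while the server side of the same flow resolves to "HTTP".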

For discussion and explanation purposes, illustrated in FIG. 1 are four exemplary
flows, F1 through F4, between client host #1 110 and server host #2 130. Flow F1 is a
file transfer utilizing the File Transfer Protocol (FTP). As shown, the file transfer (flow
F1) is delivered by a stream of packets 101 (P1 through P3) that will be reassembled by the
receiving host 110.

After the file transfer is completed, the client 110 initiates an HTTP Web session
(flow F2) with server 130. Those skilled in the art understand that a Web session typically
occurs when an Internet browser computer program such as MICROSOFT INTERNET
EXPLORER or NETSCAPE NAVIGATOR requests a web page from a World Wide Web
(WWW) service on port 80. Packets P4, P5, P6, and P9 are associated with the Web
traffic of flow F2. These packets may contain data such as a JPG format picture to be
displayed, text, a JAVA program, or other informational materials to be displayed or
handled by the client's Internet browser program.

Continuing the example of FIG. 1, while the web session of flow F2 is still open,
the client 110 sends an email illustrated by flow F3. As shown, the email packets of flow
F3 may be interleaved with the previously opened Web session of flow F2. As illustrated,
packets P7, P8, and P12 contain the e-mail message.

Finally, the client 110 requests another web page from the server 130, initiating yet
another HTTP flow F4. Packets P9, P10, P11, P12, and P14 represent the new Web
traffic.

Intruders 120 send data over the network intending to do harm or to scout details
about the hosts on the network that will let them do harm in the future. Because intruders 120
have different objectives, intruders 120 typically send communications that are not normal
for client/server communications.

For example, intruders may scan numerous high number ports, which would not
happen in normal client/server communications, or an intruder may send a User Datagram
Protocol (UDP) packet, which is commonly used with streaming media, with no data
attached. An intruder may attempt to identify which operating system a host is utilizing
by sending a packet with an undefined set of TCP flags. A high number of TCP packets
101 to a single host from another host may indicate a "half open" attack trying to tie up the
target's resources. None of these suspicious activities is customarily seen in normal
network traffic.

Probes and other communications that do not transfer data are not considered
legitimate flows. If an unauthorized network usage detection system analyzed and
consequently generated an alarm for each of these communications, numerous alarms
would be generated, creating a tendency for a network administrator to ignore all alarms.
Instead, these communications are customarily blocked by firewalls or detected by an
intrusion detection system. Therefore, an exemplary port profiling system will monitor
just the legitimate flows to detect unauthorized network usage. Consequently, a port
profiling engine will monitor flows to determine legitimate flows in which data is
transferred.

In accordance with an aspect of the invention, the port profiling engine 155 works
by assigning data packets 101 to various legitimate flows. A legitimate flow is a
communication in which data is sent and acknowledged. Port scans and some other
illegitimate flows typically do not send data with the packets 101, or if they do, the
packets are usually rejected by a TCP "Reject" packet or an ICMP "Unavailable" packet.

The engine 155 collects port information associated with each flow and stores this
information in a database 160. As shown in FIG. 1, the database 160 comprises a flow
data structure 162 and a host data structure 166.

The flow data structure 162 stores collected flow information such as the IP
addresses. The engine 155 determines which host has a lower IP address and assigns that
host IP0. The other host is assigned IP1. Port0 is associated with IP0 and port1 is the
service connection port for IP1. The flow data structure 162 also stores time and other
related packet information derived from the packet header. In the disclosed embodiment,
this time information (e.g. time of the first packet, time of the last packet) is utilized to
measure the elapse of time for purposes of flow delimiting.
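
One possible way to picture the IP0/IP1 ordering is the Python sketch below. It is an illustration under stated assumptions, not the disclosed flow data structure 162: the names `FlowKey`, `FlowRecord`, `make_flow_key`, and `record_packet` are hypothetical, and the sketch simply stores the port observed at each endpoint alongside the first and last packet times.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowKey:
    ip0: str       # numerically lower IP address
    port0: int     # port observed at ip0
    ip1: str       # the other IP address
    port1: int     # port observed at ip1
    protocol: str  # e.g. "TCP" or "UDP"


@dataclass
class FlowRecord:
    key: FlowKey
    first_seen: float  # time of the first packet
    last_seen: float   # time of the last packet
    packets: int = 0


def make_flow_key(src_ip, src_port, dst_ip, dst_port, protocol):
    """Order the endpoints so both directions of a flow map to the same key."""
    a = (ipaddress.ip_address(src_ip), src_port)
    b = (ipaddress.ip_address(dst_ip), dst_port)
    low, high = sorted((a, b))  # numeric comparison of addresses, then ports
    return FlowKey(str(low[0]), low[1], str(high[0]), high[1], protocol)


def record_packet(table, src_ip, src_port, dst_ip, dst_port, protocol, timestamp):
    """Create or update the flow record for one observed packet header."""
    key = make_flow_key(src_ip, src_port, dst_ip, dst_port, protocol)
    rec = table.get(key)
    if rec is None:
        rec = table[key] = FlowRecord(key, timestamp, timestamp)
    rec.last_seen = max(rec.last_seen, timestamp)
    rec.packets += 1
    return rec
```

Because the key is ordered by address rather than by direction, a request from 208.60.239.19 to 128.0.0.1 and the reply in the opposite direction update the same record.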

The host data structure 166 maintains the port profiling information. Port profiling
entails keeping two lists for each of the hosts: 1) a list by port number (0 to 65,535),
protocol (TCP or UDP), and type of operation (client or server) for all allowed network
services that are in the host's profile; and 2) a corresponding list of network services that
have been seen today. The host data structure is described in greater detail in reference to
FIG. 2.

In a typical preferred configuration for the present invention, a monitoring
appliance 150 operating a port profiling engine 155 is coupled to one of the network
devices 135 such as routers, switches, or hubs or to a tap in an Internet backbone link. The
monitoring appliance 150 monitors the communications among the various "inside" hosts
132, 134, 136 on the network and outside hosts 120, 110 in the attempt to detect
unauthorized network activity. Inside hosts are those hosts 132, 134, 136 of an organization
in which a network administrator is concerned with unauthorized network usage. It will
be recognized that the inside network includes a hub 135. Typically, inside hosts are
behind a firewall 170. Of course, those skilled in the art will appreciate that the port
profiling engine 155 or appliance 150 can operate with or without the existence of any
firewalls 170.



Port Profiling

FIG. 2 illustrates port profiling of the host computers on an "inside" network. A
network device 135 such as a switch, token ring hub, router, or the like connects a
plurality of "inside" hosts 132, 134, 136 on the inside network such as a local area
network (LAN), wide area network (WAN), or the like. The network monitoring
appliance 150 connects to the network device 135 in a known manner such that the
monitoring appliance 150 can scan all the packets 101 that get delivered to or sent from each
host 132, 134, 136 on the internal network.

In accordance with an aspect of the invention, the port profiling engine 155 works
by assigning data packets 101 to various flows. The port profiling engine 155 analyzes
the flow data to distinguish legitimate flows from probes. A legitimate flow is a
communication in which data is sent by TCP and acknowledged, in which data is sent by
UDP packets that are not rejected, or a local multicast or broadcast flow.
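
The three-part definition of a legitimate flow can be sketched as a simple predicate. The following Python fragment is illustrative only; the `FlowStats` fields and the function `is_legitimate` are assumptions made for this sketch, and the requirement that a UDP flow carry some data follows the earlier statement that communications which do not transfer data are not legitimate flows.

```python
from dataclasses import dataclass


@dataclass
class FlowStats:
    protocol: str               # "TCP", "UDP", or another protocol name
    data_bytes: int = 0         # payload bytes observed in the flow
    acknowledged: bool = False  # TCP: the sent data was acknowledged
    rejected: bool = False      # a TCP "Reject" or ICMP "Unavailable" was seen
    local_multicast_or_broadcast: bool = False


def is_legitimate(flow: FlowStats) -> bool:
    """Return True for flows that transfer data, False for probes."""
    if flow.local_multicast_or_broadcast:
        return True
    if flow.protocol == "TCP":
        return flow.data_bytes > 0 and flow.acknowledged
    if flow.protocol == "UDP":
        return flow.data_bytes > 0 and not flow.rejected
    # Other protocols fall outside the definition quoted above; this sketch
    # conservatively treats them as not legitimate.
    return False
```

Under this predicate a port scan, which sends no data or is rejected, never reaches the profile comparison, which is what keeps the alarm count manageable.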

Legitimate data flows (as differentiated from "probes") are observed to take place
between two hosts, one generally identifiable as acting as the "client" and the other host
acting as the "server". Only the server's TCP or UDP port number is used to identify the
network service because the server port is normally indicative of the network service being
used by the two hosts. Typically, a host computer will act either as a client or a server.
Normally, servers are set up to respond to requests initiated by clients, while client
machines initiate the requests. Of course, a host can be configured to operate as both a
client and a server.

The port profiling engine 155 determines unauthorized network usage by
comparing observed current network services with a stored profile of allowed network
services for a particular host. The port profiling engine 155 updates a host structure 166
upon observance of a network service by that host. The host data structure 166 stores
"seen today" information about all observed hosts, and maintains a port profile of
allowable network services for "inside" hosts. Port profiling entails keeping two lists for
each of the hosts:

1) a profile list by Protocol and Port Number (0 to 65,535) (for TCP and UDP),
and simply by Protocol (number) for other Transport Layer protocols, and type of
operation (as a Client or Server) for all allowed operations (the "Port Profile");
and

2) a corresponding "seen today" list of what operations have been seen today.

As illustrated in reference to FIG. 2, the host data structure includes a bit map that stores
data for the 32 most commonly seen network services and a table capable of storing data
for additional network services for each IP address. Because most hosts only utilize a
relatively small number of services, the host data structure includes 32 of the most
commonly seen services in the predefined bit maps for each host in order to save on data
storage space. In order to track other possible services, a table is appended that can list up
to an additional 10 services for each IP address. As described below, other transport layer
protocols observed are also included among the ten additional services in the port profile.
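
The bit map plus appended table arrangement can be pictured with the following Python sketch. It is a minimal illustration rather than the disclosed host data structure 166: the class `HostRecord`, its method names, and in particular the contents and ordering of `COMMON_SERVICES` are assumptions, since the text states that 32 commonly seen services are predefined but does not enumerate them here.

```python
from dataclasses import dataclass, field

# Hypothetical assignment of services to bit positions (only a few of the
# 32 available slots are filled in for this sketch).
COMMON_SERVICES = [
    ("TCP", 80, "server"), ("TCP", 80, "client"),
    ("TCP", 25, "server"), ("TCP", 25, "client"),
    ("UDP", 53, "server"), ("UDP", 53, "client"),
]
MAX_COMMON = 32  # size of the predefined bit map
MAX_EXTRA = 10   # size of the appended table
assert len(COMMON_SERVICES) <= MAX_COMMON
BIT_INDEX = {svc: i for i, svc in enumerate(COMMON_SERVICES)}


@dataclass
class HostRecord:
    profile_bits: int = 0  # allowed services, one bit per common service
    seen_bits: int = 0     # "seen today" services, same bit layout
    profile_extra: list = field(default_factory=list)
    seen_extra: list = field(default_factory=list)

    def _mark(self, service, bits_attr, extra_attr):
        idx = BIT_INDEX.get(service)
        if idx is not None:
            setattr(self, bits_attr, getattr(self, bits_attr) | (1 << idx))
            return True
        extra = getattr(self, extra_attr)
        if service in extra:
            return True
        if len(extra) >= MAX_EXTRA:
            return False  # appended table is full
        extra.append(service)
        return True

    def allow(self, service):
        return self._mark(service, "profile_bits", "profile_extra")

    def mark_seen(self, service):
        return self._mark(service, "seen_bits", "seen_extra")

    def is_allowed(self, service):
        idx = BIT_INDEX.get(service)
        if idx is not None:
            return bool((self.profile_bits >> idx) & 1)
        return service in self.profile_extra
```

A host that only browses the Web and resolves names needs no table entries at all, which is the storage saving the bit map is intended to provide.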

As described in reference to FIG. 3, there are 251 other protocols that can be
designated by the "next protocol" field in the IP header besides the three common Internet
Transport Layer protocols (ICMP, TCP, and UDP). There are generally no port numbers
with these other protocols, so these other protocols are indicated simply by the "protocol
number." The Internet Control Message Protocol (ICMP), referred to as protocol number
1, is not tracked because any host using IP would be using ICMP. The hosts using these
protocols are all shown as "servers" since many of the protocols are peer-to-peer and the
transport protocol number indicates the service.

Once the port profile is accurate, the port profiling engine 155 compares the two
lists to detect operations that are "Out of Profile" and provides an alarm to the system
operator. An Out of Profile operation can indicate the operation of a Trojan Horse
program on the host, or the existence of a non-approved network application that has been
installed.

The generation of a profile of the allowable services for each host can be a trying
task for network administrators. Therefore, the network service profile can be
automatically generated. The port profiling engine 155 can generate the allowable
network services profile by updating the profile with the used network services as
described below.

Automatic Configuration of Host Port Profiles 

Initially a network administrator may not know all the client and server
applications that are running on their network. The system is able to operate in different
modes that permit the initial port profile to be built up automatically. The update is
accomplished in four modes. The first mode is initial service usage collection. The second
mode displays new services and the services are automatically updated in the profile. In
mode three, new services are not automatically added to the host profile, but no alarm is
generated by the port profiling engine 155. In mode four, an alarm is generated for each
network service used that is not in that host's profile.

Mode 1. The port profile "Points" [host IP address, protocol (TCP or UDP or
Transport Layer number), port (for TCP and UDP), and Server or Client] are added
to the profile as they are observed.

Mode 2. Like Mode 1, except a list of hosts with new profile Points is displayed as
they are observed each time period. The new Points are automatically added to the
port profile at the end of the time period.

Mode 3. Like Mode 2, except the new Points are not added to the port profile at the
end of the time period.

Mode 4. Like Mode 3, except an alarm (by email, email to beeper, or SNMP Trap
packet) is issued as soon as an Out of Profile operation is observed.
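
The four modes can be illustrated with a short sketch. It is a hedged example rather than the listing's implementation: the names Mode, observe, and end_of_period are hypothetical, a Point is assumed to be a tuple of (host IP address, protocol, port, role), and the port profile and the list of new Points are held as plain sets.

```python
from enum import IntEnum


class Mode(IntEnum):
    COLLECT = 1          # Mode 1: add Points to the profile as observed
    DISPLAY_AND_ADD = 2  # Mode 2: display new Points, add at end of period
    DISPLAY_ONLY = 3     # Mode 3: display new Points, never add them
    ALARM = 4            # Mode 4: alarm as soon as a new Point is observed


def observe(point, profile, pending, mode):
    """Handle one observed Point; return True if an alarm should be issued."""
    if point in profile:
        return False
    if mode == Mode.COLLECT:
        profile.add(point)
        return False
    pending.add(point)  # Out of Profile: remember it for display
    return mode == Mode.ALARM


def end_of_period(profile, pending, mode):
    """Return the new Points to display; in Mode 2 also add them to the profile."""
    shown = sorted(pending)
    if mode == Mode.DISPLAY_AND_ADD:
        profile.update(pending)
    pending.clear()
    return shown
```

Keeping the new Points in a separate set is one way to let Modes 2, 3, and 4 share the same display step while differing only in what happens to the profile and whether an alarm is raised.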

A new installation might run in Mode 1 for a week to accumulate Points associated with
the various services commonly utilized by various hosts on the "inside" network. Then
the system is shifted to Mode 2. The port profile is inspected to make sure no undesired
Points are in the profile at this time. Each day in Mode 2 the new Points are inspected to
make sure they meet the organization's network usage policy since they are added
automatically. When most of the existing port profile Points seem to have been collected,
the system can shift to Mode 3 and then to Mode 4, or go directly to Mode 4. Any Point
outside of the profile will generate an alarm while operating in Mode 4. A manual editor
can modify the port profile Points as deemed necessary.

Manual Profile Editor 

In accordance with another aspect of the invention, the profile editor lets port
profile Points be added or deleted by the network administrator. If an unwanted port
profile Point was added during the automatic configuration, it can be removed (e.g., if a
Trojan Horse was discovered and removed from a host). It also allows blocks of IP
addresses to have the same port profile Point data (TCP or UDP port, Server or Client)
added or deleted in a single operation. For example, a range of IP addresses might be
assigned by DHCP to visitors with laptops. Standard profiles could be assigned to the
whole block (Web, Email Clients, no Servers) even though not all of the IP addresses were
seen during the automatic configuration.
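
A block operation of this kind might look like the following sketch. The function name edit_block, the dictionary layout, and the example address range are assumptions chosen for illustration; only the idea of applying the same Points to a whole range of addresses comes from the description above.

```python
import ipaddress


def edit_block(profiles, first_ip, last_ip, points, add=True):
    """Add or delete the same Points for every address from first_ip to last_ip.

    profiles maps an IP address string to a set of (protocol, port, role) Points.
    """
    start = int(ipaddress.IPv4Address(first_ip))
    end = int(ipaddress.IPv4Address(last_ip))
    for n in range(start, end + 1):
        ip = str(ipaddress.IPv4Address(n))
        entry = profiles.setdefault(ip, set())  # hosts never seen get a profile too
        if add:
            entry |= set(points)
        else:
            entry -= set(points)
    return profiles


# Example: a hypothetical DHCP visitor range gets Web and Email client Points only.
visitor_points = [("TCP", 80, "client"), ("TCP", 25, "client"), ("TCP", 110, "client")]
visitor_profiles = edit_block({}, "192.168.1.100", "192.168.1.103", visitor_points)
```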

Port Profiling Operation Example 

As illustrated in FIG. 2, computer#2 134 is an inside host computer whose
communications are being monitored by a network monitoring appliance 150 connected to
a switch 135. The communications are sent to and received from computer#2 134 at IP
address 192.168.1.25.

As previously discussed, a port profile has been generated for computer#2. The
host data port usage shows that computer#2 134 in its normal operations acts as a client as
shown by the C_PROFILE structure. The port profile for the host 134 indicates that the
host 134 is authorized or historically has acted as a client for the DNS, HTTP, HTTPS,
SMTP, POP3, and FTP services. However, as illustrated, the host 134 has not utilized
either the File Transfer Protocol or the Kerberos protocol in the last 24 hours.

The example illustrated in FIG. 2 shows four authorized network services
that have been seen today or used today. In the example, the four authorized services
used today are activities utilizing HTTP, DNS (over UDP), POP3, and SMTP as a client.

When acting as a client accessing the World Wide Web, the host 134 opens an
ephemeral high number port, as shown in reference to FIG. 5, when initiating HTTP
communications and sends the communications to port 80 at the IP address of the server
receiving the request. The port profiling engine then indicates on the "seen today" data
structure ("CLIENT") that host 134 acted as a client on port 80 utilizing the HTTP service.
No alarm is generated because the client port profile for host 134 (C_Profile) lists port 80
as an accepted network service. This service is shown by the 80-TCP Point 182. As
anticipated, the communication utilized the DNS service on UDP port 53. Consequently,
the "seen today" structure also indicates this Point as being utilized, as shown by 53-UDP
Point 181. Again, because this Point (53-UDP) is listed in the C_Profile structure, no
alarm is generated.

In addition, computer#2 134 has requested to view email stored on an email server.
The request is sent to port 110 of the email server, which corresponds to the POP3 service.
The host 134 acts as a client when requesting to view its email. Similarly, computer#2 134
has sent an email to another host utilizing Simple Mail Transfer Protocol (SMTP).
Consequently, port profiling engine 155 indicates on the "seen today" client structure
(CLIENT) that Point 110 (POP3 service) 184 and Point 25 (SMTP service) 183 have been
utilized as a client. Likewise, since client Points 110 and 25 are listed in the C_Profile
structure for that host 134, no alarm is generated.

For purposes of discussion, assume that the clock for Computer#2 134 drifts. The
clock can be updated to reflect the current time by requesting the time service on port 37
of a time server. Time is not defined as one of the 32 commonly seen services.
Consequently, the port profiling engine updates the appended table with the applicable
port (37) and the Transport Layer protocol (UDP) as shown by port 37 Point 185 in the
table. If time service was not in profile for the host, an alarm would be generated. After
reviewing the alarm message, the system administrator could use the manual port editor
and add the Point 37 to the C_Profile. As illustrated, Point 37 has been included in the
port profile for Computer#2 134, which will prevent further alarms from that host using the
time service.

Continuing the example, the inside network utilizes a virtual private network
(VPN) on port 50. Consequently significant traffic is communicated on a proprietary
Transport Layer protocol to port 50. Since this service is not one of the defined common
services, the port profiling engine tracks the port and Transport Layer protocol in the
appended table of the host structure.

Extending the example, software facilitating instant messaging or "chat" has been
installed in Computer#2 134. Computer#2 134 sends chat messages to H2 120 on port
8080 at IP address 208.60.232.19. This chat service is also not one of the defined
common services and is tracked in the table of non-standard services. Although
computer#2 initiated the service and is acting like a client, the service is not in the
C_Profile for Computer#2 134 as shown by UDP-8080 Point 186. Consequently, an alarm
is generated. The system operator can remove the installed chat program, if desired, or
take other appropriate action.

Likewise, Computer#2 134 has installed an unauthorized web page. When the web
page is accessed, the host 134 is now acting like a server. The port usage as shown in the
S_Profile for Computer#2 134 does not include port 80 as a server, i.e., no bit is set in the
80-TCP column of the S_Profile. Consequently, an alarm will be generated. The system
administrator can remove the unauthorized web page that may be monopolizing valuable
bandwidth.

Not all unauthorized usage is non-malicious in nature. Despite firewalls and
intrusion detection systems, a Trojan Horse may still get installed on a network. Many
Trojan Horse applications listen for a request emanating from another computer (H3 in
this example) on a pre-selected, normally unused, port. When a communication is
directed to the pre-selected port, the Trojan Horse responds by uploading information to
the requesting address indicated in a received packet. Assume in FIG. 2, H3 has installed
a Trojan Horse on Computer#2 134. Upon receiving a request from IP address
110.5.47.224, the application begins sending data from port 6969. Computer#2 is acting
as a server on port 6969. This service is out of profile for this host 134. An alarm is
generated and appropriate action can be taken.

It will now be appreciated that the disclosed methodology of unauthorized network
usage detection is accomplished at least in part by comparing a predetermined port profile
for a host against that host's recent activity. The addresses and port numbers of
communications are easily discerned by analysis of the header information in a datagram.

Packet 

Referring now to FIG. 3, and inasmuch as an understanding of Internet data
packets is helpful for constructing embodiments of the present invention, a description of
such packets, also called "datagrams", will next be provided as an aid to understanding. A
packet or datagram 101 is a self-contained, independent entity or unit of data carrying
sufficient information to be routed from a source to a destination computer without
reliance on earlier exchanges between the source and destination computer. Packets 101
have a header and a data segment as illustrated by FIG. 3. The term "packet" in present-day
parlance has generally replaced the term "datagram".

Restated, a packet 101 is the unit of data that is routed between an origin and
destination on a packet-switched network such as the Internet 199. A packet-switching
scheme is an efficient method of handling transmissions on a connectionless network.
However, connection-oriented protocols can be utilized to create a session. A session is a
series of interactions between two communication end points that occur during the span of
a single connection. A detailed discussion of a TCP/IP session is described in reference to
FIG. 4. However, a host can send a message without establishing a connection with the
recipient. That is, the host simply sends a packet 101 onto the network 199 with the
destination address and hopes that the packet arrives.

FIG. 3 illustrates an exemplary TCP/IP packet or datagram 310 and an exemplary
UDP datagram 340. A typical TCP/IP packet like 310 includes a
header portion comprising an IP header 320 and a TCP header 330, followed by a data
portion that contains the information to be communicated in the packet. The information
in the IP header 320 contained in a TCP/IP packet 310, or any other IP packet, contains the
IP addresses and assures that the packet is delivered to the right host. The transport layer
protocol (TCP) header follows the Internet protocol header and specifies the port numbers
for the associated service.

The header portion in the typical TCP/IP datagram 310 is 40 bytes including 20
bytes of IP header 320 information and 20 bytes of TCP header 330 information. The data
portion or segment associated with the packet 310 follows the header information.

In regards to a typical IP packet 310, the first 4 bits of the IP header 320 identify
the Internet protocol (IP) version. The following 4 bits identify the IP header length in
32-bit words. The next 8 bits differentiate the type of service by describing how the packet
should be handled in transit. The following 16 bits convey the total packet length.

Large packets tend to be fragmented by networks that cannot handle a large
packet size. A 16-bit packet identification is used to reassemble fragmented packets.
A set of three one-bit fragmentation flags controls whether a packet is or may be fragmented.
The 13-bit fragment offset gives the position of the fragment's data within the reassembled
packet, counted in units of 8 bytes. In a series of fragments, the first offset will be zero.

After the fragmentation information, an 8-bit time to live field specifies the
remaining life of a packet and is decremented each time the packet is relayed. If this field
is 0, the packet is destroyed. Next is an 8-bit protocol field that specifies the transport
protocol used in the data portion. The following 16-bit field is a header checksum on the
header only. Finally, the last two fields illustrated contain the 32-bit source address and
32-bit destination address. IP packet data follows the address information.
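
The field layout just described can be decoded with a few lines of code. The following is a minimal sketch for the fixed 20-byte header only (IP options are ignored); the function name parse_ipv4_header and the dictionary keys are illustrative choices, not names from the program listing.

```python
import socket
import struct


def parse_ipv4_header(raw):
    """Decode the fixed 20-byte IPv4 header at the start of raw (bytes)."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,                   # first 4 bits
        "header_len_bytes": (ver_ihl & 0x0F) * 4,  # length field counts 32-bit words
        "tos": tos,
        "total_len": total_len,
        "id": ident,
        "flags": flags_frag >> 13,                 # three one-bit flags
        "frag_offset": flags_frag & 0x1FFF,        # 13-bit fragment offset
        "ttl": ttl,
        "protocol": proto,                         # 1 = ICMP, 6 = TCP, 17 = UDP
        "checksum": checksum,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }
```

The protocol value returned here is what a port profile would record for Transport Layer protocols other than TCP and UDP, which carry no port numbers.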

In a TCP/IP datagram 310, the initial data of the IP datagram is the TCP header
330 information. The initial TCP header 330 information includes the 16-bit source and
16-bit destination port numbers. A 32-bit sequence number for the data in the packet
follows the port numbers. Following the sequence number is a 32-bit acknowledgement
number. If an ACK flag (discussed below) is set, this number is the next sequence number
the sender of the packet expects to receive. Next is a 4-bit data offset, which is the
number of 32-bit words in the TCP header. A 6-bit reserved field follows.

Following the reserved field, the next 6 bits are a series of one-bit flags, shown in
FIG. 3 as flags U, A, P, R, S, F. The first flag is the urgent flag (U). If the U flag is set, it
indicates that the urgent pointer is valid and points to urgent data that should be acted
upon as soon as possible. The next flag is the A (or ACK or "acknowledgment") flag.
The ACK flag indicates that an acknowledgment number is valid, and acknowledges that
data has been received. The next flag, the push (P) flag, tells the receiving end to push all
buffered data to the receiving application. The reset (R) flag is the following flag, which
terminates both ends of the TCP connection. Next, the S (or SYN for "synchronize") flag
is set in the initial packet of a TCP connection where both ends have to synchronize their
TCP buffers. Following the SYN flag is the F (for FIN or "finish") flag. This flag
signifies that the sending end of the communication and the host will not send any more
data but still may acknowledge data that is received.

Following the TCP flag bits is a 16-bit receive window size field that specifies the
amount of space available in the receive buffer for the TCP connection. The checksum of
the TCP header is a 16-bit field. Following the checksum is a 16-bit urgent pointer that
points to the urgent data. The TCP/IP datagram data follows the TCP header.
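
As a hedged illustration of the TCP header layout described above, the sketch below unpacks the fixed 20-byte header and reports the six flags using the same letters as FIG. 3. The names parse_tcp_header and FLAG_BITS are hypothetical.

```python
import struct

# One-bit flags in header order, using the letters from FIG. 3.
FLAG_BITS = {"U": 0x20, "A": 0x10, "P": 0x08, "R": 0x04, "S": 0x02, "F": 0x01}


def parse_tcp_header(raw):
    """Decode the fixed 20-byte TCP header at the start of raw (bytes)."""
    (src_port, dst_port, seq, ack, off_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", raw[:20])
    flags = off_flags & 0x3F  # low 6 bits hold U, A, P, R, S, F
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "data_offset_words": off_flags >> 12,  # header length in 32-bit words
        "flags": "".join(name for name, bit in FLAG_BITS.items() if flags & bit),
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent,
    }
```

For a SYN-ACK packet this sketch reports the flags string "AS", since the letters are emitted in header order.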

Still referring to FIG. 3, a typical User Datagram Protocol (UDP) packet 340
provides a procedure for application programs to send messages to other programs with a
minimum of protocol mechanisms. The IP protocol previously described is used as the
underlying protocol. The UDP protocol is transaction oriented and delivery protection is
not guaranteed. Applications requiring reliable delivery of data typically use the
previously described Transmission Control Protocol (TCP).

The 16-bit UDP source port is a field indicating the port to which a reply, when meaningful,
should be addressed. The 16-bit UDP destination port specifies the server program on the
receiving host to execute the packet. Next, the 16-bit UDP message length field is the
length in bytes of the user datagram including header and any data. Following the length
field is the 16-bit checksum of the UDP header, the UDP pseudo header information 350
from an IP header 320, and the data.

As will be understood by those skilled in the art, the fundamental Internet service
consists of a packet delivery system. Internet service is typically considered
"connectionless" because each packet is treated independently of all others. Some
transport protocols such as UDP provide unreliable service because the delivery of the
packet is not guaranteed. Other transport protocols such as TCP provide a mechanism to
ensure delivery of a packet and therefore can be used to establish computer-to-computer
"sessions" in the conventional sense of the term. FIG. 4 illustrates a typical TCP/IP
session and the guaranteed packet delivery mechanism.

As previously stated, the port profiling engine 155 does not analyze the data
segments of packets for signature identification. Instead, the engine 155 associates all
packets with a flow. It analyzes certain statistical data and tracks the associated network
services. The engine 155 compares recent activity to a predetermined port profile. An
alarm is generated when a host uses a service that is not in its port profile.

However, in the exemplary embodiment, the port profiling engine only analyzes
legitimate flows for unauthorized network usage in order to minimize generated alarms.
In a legitimate flow, some data is transmitted and acknowledged. A discussion of TCP/IP
flows follows to further illustrate legitimate flows.

Legitimate Flow

Turning next to FIG. 4, a TCP session 400 is a full duplex connection that allows
concurrent transfer of data in both directions. Before the transfer can start, both the
sending and receiving application programs interact with their respective operating
systems, informing them of the impending stream transfer. Protocol software
communicates by sending messages across, verifying that the transfer is authorized, and
indicating that both sides are ready to receive data.

FIG. 4 illustrates an exemplary TCP/IP session 400. As discussed in reference to
FIG. 3, the SYN flag is set whenever one host initiates a session with another host. In the
initial packet, Host1 sends a message with only the SYN flag set. The SYN flag is
designed to establish a TCP connection and allow both ends to synchronize their TCP
buffers. Host1 provides the sequence number of the first data packet it will send.

Host2 responds with a SYN-ACK packet. In this message, both the SYN flag and
the ACK flag are set. Host2 provides the initial sequence number for its data to Host1.
Host2 also sends to Host1 the acknowledgment number, which is the next sequence
number Host2 expects to receive from Host1. In the SYN-ACK packet sent by Host2, the
acknowledgment number is the initial sequence number of Host1 plus 1, which should be
the next sequence number received.

Host1 responds to the SYN-ACK with a packet with just the ACK flag set. Host1
acknowledges that the next packet of information received from Host2 will be Host2's
initial sequence number plus 1. The three-way handshake is complete and data is
transferred. Only communications in which data is transferred are considered legitimate
communications. If no data is transferred, the communication is considered a probe.

Host2 responds to the ACK packet with its own ACK packet. Host2 acknowledges the
data it has received from Host1 by sending an acknowledgment number one greater than
its last received data sequence number. Both hosts send packets with the ACK flag set
until the session is to end, although the P and U flags may also be set, if warranted.

As illustrated, when Host1 terminates its end of the session, it sends a packet with
the FIN and ACK flags set. The FIN flag informs Host2 that no more data will be sent by
Host1. The ACK flag acknowledges the last data received by Host1 by informing Host2
of the next sequence number it expects to receive.

Host2 acknowledges the FIN packet by sending its own ACK packet. The ACK
packet has the acknowledgement number one greater than the sequence number of Host1's
FIN-ACK packet. ACK packets are still delivered between the two hosts, except that
Host1's packets have no data appended to the TCP/IP end of the headers.

When Host2 is ready to terminate the session, it sends its own packet with the FIN
and ACK flags set. Host1 responds that it has received the final packet with an ACK
packet providing to Host2 an acknowledgment number one greater than the sequence
number provided in the FIN-ACK packet of Host2.

Alternatively, a host may desire to keep a session active even after it has finished
sending its current data. If more data is to be sent in the near future, it is more efficient to
keep a session open than it is to open multiple sessions. A session wherein the connection
is kept open in case future data is to be communicated is typically referred to as a
"persistent" session. In this scenario, a session is closed by sending a packet with the reset
flag (R) set (also called a "reset packet") after no data is delivered for a period of time.
Many browser applications provide a 300-second window of inactivity before closing a
session with an R packet (reset).

The described TCP session 400 of FIG. 4 is a generic TCP session in which a
network might engage. In accordance with the invention, flow data is collected about the
session to help determine if the communication is abnormal. In the preferred embodiment,
information such as the total number of packets sent, the total amount of data sent, the
session start time and duration, and the TCP flags set in all of the packets, are collected,
stored in the database 160 and analyzed.

As an example of a non-legitimate flow, a TCP/IP packet with both the SYN flag and
the FIN flag set would not exist in a normal communication. Because a packet with both
the SYN and FIN flags set is undefined, each operating system handles this packet in
different methods. An operating system may send an ICMP message, a reset, or possibly
just ignore it and send nothing. Consequently, an intruder may send a SYN-FIN packet
specifically to help identify the operating system of the targeted host.

As another example, if a particular host sends a large number of SYN packets to a
target host and in response receives numerous R packets from the targeted host, a potential
TCP probe is indicated. Likewise, numerous UDP packets sent from one host to a targeted
host and numerous ICMP "port unavailable" packets received from the targeted host
indicates a potential UDP probe. A stealth probe is indicated by multiple packets from the
same source port number sent to different port numbers on a targeted host.

As has been described elsewhere, UDP packets are often used in connection with
streaming media and other applications that provide data to many hosts. A UDP packet
with no appended data does not occur in normal communications. In fact, a flow with
numerous SYN packets with numerous SYN-ACK responses may indicate a half-open
attack designed to tie up the targeted host's ports and resources. From the foregoing, it
will be understood and appreciated that an analysis of legitimate flows will not include
flows without some data transfer.
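
The distinction between legitimate flows and probes can be illustrated with a small sketch. It assumes that per-flow statistics are accumulated packet by packet and that a flow is treated as legitimate only when some data was carried, an ACK was seen, and no SYN-FIN packet appeared. The names FlowStats and is_legitimate are hypothetical, and the exact test used by the engine may differ.

```python
from dataclasses import dataclass


@dataclass
class FlowStats:
    packets: int = 0
    data_bytes: int = 0
    flags_seen: str = ""        # union of TCP flag letters seen in the flow
    syn_fin_seen: bool = False  # a single packet carried both SYN and FIN

    def add_packet(self, flags, payload_len):
        """Accumulate one packet given its flag letters and payload length."""
        self.packets += 1
        self.data_bytes += payload_len
        if "S" in flags and "F" in flags:
            self.syn_fin_seen = True
        self.flags_seen = "".join(sorted(set(self.flags_seen) | set(flags)))


def is_legitimate(flow):
    """Assumed rule: data was transferred and acknowledged, with no SYN-FIN packet."""
    return flow.data_bytes > 0 and "A" in flow.flags_seen and not flow.syn_fin_seen
```

A flow consisting only of a SYN answered by a reset carries no data and would therefore be excluded from port profile analysis under this rule.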

Network Services 

A single network service is typically associated with a particular port on a server,
and is also associated with a port on a client machine; port numbers are typically fixed in
server machines such as host#2 server 130 (FIG. 1) but typically vary in client machines
such as host#1 client 110. However, the port profiling engine 155 associates in the host
data structure 166 both the client host and the server host with the server port number
because that port generally represents the network service being utilized.

FIG. 5 is an illustration of some common flows associated with some common
network services. As is known, each host has its own unique IP address. IP addresses are
typically referred to by four sets of numbers separated by periods, e.g. N.N.N.N, where N
varies between 0 and 255. Also as described, assigned port numbers of the server
delineate the services provided by that server; port numbers in present-day systems vary
between 0 and 65,535.

The client is illustrated with an IP address of ADDRESS1 while the server is
illustrated with IP address ADDRESS0. As illustrated in the example, three separate
services (HTTP, SMTP, and FTP) are being invoked by the client. A Web browser
application (not shown) running on the client machine utilizes the Hypertext Transfer
Protocol (HTTP), an email application (also not shown) utilizes the Simple Mail Transfer
Protocol (SMTP), and a file transfer application program (not shown) utilizes the File
Transfer Protocol (FTP).

The first flow illustrated would be Web traffic (HTTP protocol) between the client
at IP ADDRESS1 and the server at IP ADDRESS0. The client Web browser opens a
random ephemeral high port (51,132) as illustrated in the example. A high port is utilized
because the low port numbers less than 1024 are preassigned for designated services. One
of these designated services is port 80 for HTTP, which transfers displayable Web pages and
related files in the known manner. The Web browser sends the request to the server's port
80. The server port responds by sending the requested Web page data in packets wherein
the port number in the packets transmitted to the client sets the destination port to 51,132
of the client. All communications by clients utilizing HTTP are sent to port 80 of the
server. One flow would be the HTTP communications between port 51,132 of
ADDRESS1 and port 80 of ADDRESS0. However, the port profiling engine will update
ADDRESS1 as a client utilizing port 80 and ADDRESS0 as the server utilizing port 80 in
the "seen today" structure. The port profiling engine compares the port profile of each
host with the current activity for that host to determine unauthorized network usage.
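
The bookkeeping described above can be sketched as follows. The sketch assumes, for illustration only, that the lower-numbered of the two ports is the server (service) port. The function names record_flow and out_of_profile and the dictionary layout are hypothetical.

```python
def record_flow(seen_today, ip_a, port_a, ip_b, port_b, proto="TCP"):
    """Record a flow under its server port for both hosts and return the flow key.

    Assumption: the lower-numbered port identifies the service and the server.
    """
    if port_a <= port_b:
        server_ip, client_ip, service_port = ip_a, ip_b, port_a
    else:
        server_ip, client_ip, service_port = ip_b, ip_a, port_b
    # Both hosts are associated with the server port, not the ephemeral port.
    seen_today.setdefault(client_ip, set()).add((proto, service_port, "client"))
    seen_today.setdefault(server_ip, set()).add((proto, service_port, "server"))
    return (client_ip, server_ip, proto, service_port)


def out_of_profile(seen_today, profile, ip):
    """Return the operations seen today for ip that are absent from its profile."""
    return seen_today.get(ip, set()) - profile.get(ip, set())
```

With the HTTP example above, a packet from port 51,132 of ADDRESS1 to port 80 of ADDRESS0 records ADDRESS1 as a client and ADDRESS0 as a server of TCP port 80, and the ephemeral port never enters either list.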

In accordance with an aspect of the invention, a flow is terminated if no
communications occur between the two IP addresses and the one low port (e.g. port 80)
for 330 seconds. Most Web browsers or a TCP connection send a reset packet (i.e. a
packet with the R flag set) if no communications are sent or received for 5 minutes.

The next flow illustrated is email traffic between the client and server utilizing
server port 25. The client email application opens a random high ephemeral port, e.g. port
49,948 as illustrated in FIG. 5. The client's email application sends the email utilizing the
Simple Mail Transfer Protocol (SMTP) to the server's port 25. Port 25 is conventionally
designated for SMTP communications. A flow is terminated if no communications are
delivered between the two IP addresses and the low port for 330 seconds. If the client
sends another SMTP email packet or packets within 330 seconds of the end of the first
email to the server, only one flow would exist.

For example, as shown in FIG. 5, if a second email packet originating from the
ephemeral port 35,620 is sent within 330 seconds, only one flow would exist. If the
second email packet was later than 330 seconds from the first sent email, it would be
classified as another flow for analysis purposes. Again, the port profiling engine will update
ADDRESS1 as a client utilizing port 25 and ADDRESS0 as the server utilizing port 25 in
the "seen today" structure. The port profiling engine compares the port profile of each
host with the current activity for that host to determine unauthorized network usage.
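
The 330-second rule can be expressed as a small flow table keyed by the two IP addresses and the server port, so that packets from different ephemeral client ports fall into the same flow. This is a hedged sketch: the names FlowTable and FLOW_TIMEOUT are hypothetical, and timestamps are passed in as plain seconds.

```python
FLOW_TIMEOUT = 330  # seconds of silence after which a flow is terminated


class FlowTable:
    def __init__(self):
        self.last_seen = {}   # flow key -> time of the most recent packet
        self.flow_count = {}  # flow key -> number of distinct flows so far

    def packet(self, client_ip, server_ip, service_port, now):
        """Register a packet; return True if it starts a new flow."""
        key = (client_ip, server_ip, service_port)  # ephemeral port is not in the key
        last = self.last_seen.get(key)
        is_new = last is None or now - last > FLOW_TIMEOUT
        if is_new:
            self.flow_count[key] = self.flow_count.get(key, 0) + 1
        self.last_seen[key] = now
        return is_new
```

In the email example, a second message sent 200 seconds after the first would extend the existing flow, while one sent 331 seconds after the last packet would be counted as a second flow.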

As is well known, the File Transfer Protocol (FTP) is the simplest method to
exchange files between hosts on the Internet. A client begins a session by sending a
request to communicate to port 21 of a designated server machine. The client also includes
a second port number to be used when data is exchanged. The server initiates the
exchange from its own port 20 (FTP DATA) to the port designated by the client, port 4993
as illustrated in FIG. 4. In the "passive" mode, FTP uses one or more TCP connections
using two ephemeral ports. The port profiling engine treats these secondary connections
as part of "TCP port 21" operation. The port profiling engine compares the port profile of
each host with the current network service for that host to determine unauthorized network
usage.

The collected flow data is analyzed to determine the associated network service
provided. A host data structure is maintained containing a profile of the network services
normally associated with the host. If the observed network service is not one of the
normal network services performed as defined by the port profile for that host, an alarm
signal is generated and action can be taken based upon the detection of an Out of Profile
network service. An Out of Profile operation can indicate the operation of a Trojan Horse
program on the host, or the existence of a non-approved network application that has been
installed.

Port profiling engine 

FIG. 6 illustrates a logical software architecture of a port profiling engine 155
constructed in accordance with an embodiment of the present invention. As will be
understood by those skilled in the art, the system is constructed utilizing Internet-enabled
computer systems with computer programs designed to carry out the functions described
herein. Preferably, the various computing functions are implemented as different but
related processes known as "threads" which execute concurrently on modern day
multi-threaded, multitasking computer systems.

The computer programs or threads are executed on a computer system 800
constructed as described in reference to FIG. 8, which illustrates a suitable exemplary
computer system that may be utilized to construct a monitoring appliance 150 including a
port profiling engine 155, or a separately implemented port profiling engine. Although the
described embodiments are generally described in reference to an Internet-accessible
computer system that is dedicated to implementing the engine 155, those skilled in the art
will recognize that the present invention can be implemented in computer program code
that can execute in conjunction with other program modules in various types of general
purpose, special purpose, or dedicated computers. Accordingly, it will be understood that
the terms "computer," "operating system," and "application program" include all types of
computers and the program modules designed to be implemented by the computers.

The discussion of methods that follows, especially in the software architecture, is
represented largely in terms of processes and symbolic representations of operations by
conventional computer components, including a central processing unit (CPU), memory
storage devices for the CPU, network communication interfaces, connected display
devices, and input devices. Furthermore, these processes and operations may utilize
conventional computer components in a heterogeneous distributed computing
environment, including remote file servers, remote computer servers, and remote memory
storage devices. Each of these conventional distributed computing components is
accessible by the CPU via a communication network.

The processes and operations performed by the computer include the manipulation
of signals by a CPU, or remote server such as an Internet Web site, and the maintenance of
these signals within data structures that reside in one or more of the local or remote memory
storage devices. Such data structures impose a physical organization upon the collection
of data stored within a memory storage device and represent specific electrical, optical, or
magnetic elements. These symbolic representations are the means used by those skilled in
the art of computer programming and computer construction to effectively convey
teachings and discoveries to others skilled in the art. For the purposes of this discussion, a
process is understood to include a sequence of computer-executed steps leading to a
concrete, useful, and tangible result, namely, the detection of unauthorized network usage
based upon network service activity.

These steps generally require manipulations of quantities such as IP addresses,
packet length, header length, start times, end times, port numbers, and other packet related
information. Usually, though not necessarily, these quantities take the form of electrical,
magnetic, or optical signals capable of being stored, transferred, combined, compared, or
otherwise manipulated. It is conventional for those skilled in the art to refer to these
signals as bits, bytes, words, values, elements, symbols, characters, terms, numbers,
points, records, objects, images, files or the like. It should be kept in mind, however, that
these and similar terms should be associated with appropriate quantities for computer
operations, and that these terms are merely conventional labels applied to quantities that
exist within and during operation of the computer.

It should also be understood that manipulations within the computer are often
referred to in terms such as displaying, deciding, storing, adding, comparing, moving,
positioning, placing, and altering, which are often associated with manual operations
performed by a human operator. The operations described herein include machine
operations performed in conjunction with various input provided by a human operator or
user that interacts with the computer. In addition, it will be understood that the programs,
processes, routines and methods described herein are not related or limited to any
particular computer or apparatus, nor are they related or limited to any particular
communication network or computer architectures. Rather, various types of
general-purpose machines may be used with program modules constructed in accordance with the
teachings described herein. Similarly, it may prove advantageous to construct a
specialized apparatus to perform the method steps described herein by way of dedicated
computer systems in a specific network architecture with hard-wired logic or programs
stored in nonvolatile memory, such as read only memory.

With the foregoing in mind, the drawing figures starting with FIG. 6, and the
accompanying appendix of computer program code, illustrate various functions, processes,
or routines carried out by an embodiment of the present invention. It will also be
understood that the processes and methods presented here may be arranged differently, or
steps taken in a different order. In other words, some processes and methods may be
deleted, repeated, re-ordered, combined, or blended to form similar processes and
methods.

FIG. 6 illustrates the operation of the preferred port profiling engine 155. The
engine stores data from its operations in a database 160, which in the disclosed
embodiment comprises two data structures: one used to collect statistics on data flows
in progress (flow data structure 162), and another to accumulate data on the host
computers (host data structure 166) involved in those flows. According to the
embodiment, the port profiling engine 155 uses three main threads or processes that read
and write these data structures to identify possible unauthorized network usage, which is
identified as Out of Profile network services. These threads are a packet classifier thread
610, a flow collector thread 620, and an alert manager thread 630. The threads also
identify the client and server network applications that are being operated by the hosts
that are observed participating in the flows (port profiling).

Packet Classifier 

The header data is read by the packet classifier thread 610. The packet classifier
thread 610 runs whenever new packet information is available. Based on the source and
destination IP addresses, the thread 610 searches for an existing flow in the flow data
structure 162. To facilitate searching and record insertion, a symmetric hash of the two IP
addresses is generated and used as the index of an array that points to the beginning of a
two-way linked list of all flows with that hash value. As known to those skilled in the art,
a symmetric hash is a mathematical process that creates a probabilistically unique number
that facilitates rapid indexing and sorting within a data structure such as flow data
structure 162.
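
As an illustration of the lookup just described, the following minimal sketch shows one way a symmetric hash could be computed. The mixing step (XOR followed by a modulo) and the names flow_hash and FLOW_SLOTS are assumptions made for this example; the text does not specify the hash actually used. The only property the sketch relies on is that swapping the two addresses yields the same index.

```c
#include <stdint.h>

#define FLOW_SLOTS 131073u  /* assumed table size for this sketch */

/* Hypothetical symmetric hash: XOR is commutative, so the result does not
 * depend on which address is the source and which is the destination. */
uint32_t flow_hash(uint32_t ip_a, uint32_t ip_b)
{
    return (ip_a ^ ip_b) % FLOW_SLOTS;
}
```

Because both directions of a conversation hash to the same slot, a packet and its reply land in the same linked list without any prior ordering of the addresses.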

Flow processing is done for TCP and UDP packets, and the port numbers in the
transport layer header are used to identify the flow record to be updated. For ICMP
packets that constitute rejections of a packet, the copy of the rejected packet in the ICMP
data field is used to identify the IP addresses and port numbers of the corresponding flow.

For purposes of the description which follows, the IP address with the lower value,
when considered as a 32-bit unsigned integer, is designated ip[0] and the corresponding
port number is designated pt[0]. The higher address is designated ip[1] and the
corresponding TCP or UDP port number is designated pt[1]. At some point, either pt[0] or
pt[1] may be designated the "server" port by setting an appropriate bit in a bit map that is
part of the flow record (record "state", bit 1 or 2 is set).
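
The ordering convention above can be sketched as follows. The struct flow_key type and the function flow_key_from_packet are hypothetical names introduced only for this illustration, and the handling of two equal addresses (treated as "source is lower") is an assumption the text does not address.

```c
#include <stdint.h>

/* Hypothetical key type for this sketch; field names follow the text. */
struct flow_key {
    uint32_t ip[2];   /* ip[0] = lower address, ip[1] = higher address */
    uint16_t pt[2];   /* pt[i] is the port that belongs to ip[i] */
};

/* Build the canonical key for a packet. Addresses are compared as 32-bit
 * unsigned integers in host byte order. Returns the direction index:
 * 0 if the packet source is ip[0], 1 if it is ip[1]. */
int flow_key_from_packet(uint32_t src_ip, uint16_t src_pt,
                         uint32_t dst_ip, uint16_t dst_pt,
                         struct flow_key *key)
{
    int src_is_low = (src_ip <= dst_ip);   /* ties: assumed to count as low */

    key->ip[0] = src_is_low ? src_ip : dst_ip;
    key->pt[0] = src_is_low ? src_pt : dst_pt;
    key->ip[1] = src_is_low ? dst_ip : src_ip;
    key->pt[1] = src_is_low ? dst_pt : src_pt;
    return src_is_low ? 0 : 1;
}
```

The returned direction index is the same [2] index that selects which per-host counter in the flow record a packet contributes to.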

If a particular packet 101 being processed by the packet classifier 610 matches a
particular entry or record in the flow data structure 162, data from that particular packet
101 is used to update the statistics in the corresponding flow data structure record. A
packet 101 is considered to match a flow data structure record if both IP numbers match
and:

a) both port numbers match and no port is marked as the "server" port, or

b) the port number previously marked as the "server" port matches, or

c) one of the port numbers matches, but the other does not, and neither port
number has been marked as the server port (in this case the matching port number
is marked as the "server" port).
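
The three matching rules can be sketched as a single function. This is an illustrative sketch, not the listing from the program appendix: struct flow_rec, flow_matches, and the STATE_* bit values are assumptions (the bit values are inferred from the remark that bit 1 or 2 of "state" marks the server port), and the packet's addresses and ports are assumed to be already arranged lower address first.

```c
#include <stdint.h>

#define STATE_UDP      0x1u  /* assumed: bit 0 marks a UDP flow */
#define STATE_SERVER0  0x2u  /* assumed: bit 1 marks pt[0] as the server port */
#define STATE_SERVER1  0x4u  /* assumed: bit 2 marks pt[1] as the server port */

struct flow_rec {
    uint32_t ip[2];
    uint16_t pt[2];
    uint32_t state;
};

/* Returns 1 if the packet belongs to flow f, 0 otherwise.
 * Rule (c) marks the matching port as the server port as a side effect. */
int flow_matches(struct flow_rec *f, const uint32_t ip[2], const uint16_t pt[2])
{
    if (f->ip[0] != ip[0] || f->ip[1] != ip[1])
        return 0;                            /* both IP numbers must match */

    if (f->state & STATE_SERVER0)            /* rule (b), server is pt[0] */
        return f->pt[0] == pt[0];
    if (f->state & STATE_SERVER1)            /* rule (b), server is pt[1] */
        return f->pt[1] == pt[1];

    int m0 = (f->pt[0] == pt[0]);
    int m1 = (f->pt[1] == pt[1]);
    if (m0 && m1)                            /* rule (a) */
        return 1;
    if (m0 != m1) {                          /* rule (c): exactly one matches */
        f->state |= m0 ? STATE_SERVER0 : STATE_SERVER1;
        return 1;
    }
    return 0;
}
```

Rule (c) is what lets several client connections from changing ephemeral ports to one server port collapse into a single flow record.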

If no prior data record exists in the flow data structure 162 that matches the current
packet, a new flow data record is created in the flow data structure 162 using the IP
addresses and port numbers from the current packet, and is linked to the end of the
appropriate linked list of flow records. The time that the flow started, i.e. the first packet's
capture time, is written into the record as the "start" time, in a predetermined field of the
data record.

The time of each packet is written into the record "last", overwriting the previous
value.

Flow Data Structure 

The preferred flow data structure 162 has a plurality of different fields in each
record. Since packet header information is analyzed for port profiling, other flow
information can be accumulated from the packet header data for analysis. The preferred
flow data structure (in the known C programming language) is as follows, where the index
shown as [2] (0 or 1) is "0" if the packet source is the host ip[0], "1" otherwise (e.g. if the
packet source is ip[1], then the packet bytes are added to bytes[1], pkts[1] is incremented,
etc.):



#define SLOTS 131073            // no. flows in data table
struct flow_db {
    unsigned long  ip[2];       // ip[0] - lower ip address - ip[1] - higher ip address
    unsigned short pt[2];       // tcp or udp ports, pt[0] and pt[1]
    unsigned short service;     // port number of server
    unsigned long  down;        // linked list index
    unsigned long  up;          // linked list index
    unsigned long  start;       // time Flow started
    unsigned long  last;        // time Flow ended
    unsigned long  state;       // Server = 0, 2 or 4, UDP = 1 (Server Port Marked)
    unsigned long  bytes[2];    // bytes sent by ip[0] and ip[1]
    unsigned long  pkts[2];     // packets sent by ip[0] and ip[1]
    unsigned long  flgs[2];     // bitmap of all TCP flags seen
    unsigned char  flag[2][7];  // 0 bad, 1 reset, 2 urgent, 3 syn, 4 syn-ack, 5 fin, 6 fragments,
                                // (counts of packets seen with various TCP flag combinations)
                                // - 7 UDP rejects
    unsigned short scans;       // max number ports seen for ip pair, detects "Port Scans"
} flow[SLOTS];

Notice that many of the fields are counters for each host, e.g., the number of packets and
bytes sent, the number of packets with various TCP flag-bit combinations sent for TCP
flows, the number of ICMP "port-unavailables" for a UDP flow. Also bitmaps can be
filled in, such as the bitmap of all TCP flags seen which has been bitwise OR'ed with the
TCP flag field of each TCP packet. Data is filled in for the source (originating) host.

The packet classifier thread 610 also adds some data directly to the host data
structure 166. Most of this data could be added later by the flow collector thread 620
(such as bytes sent by each host), but adding it on a packet by packet basis allows
collection of real time rate information (such as bytes sent in each time interval). These
records are indicated in the host data structure 166 below.
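
The per-packet bookkeeping described above can be sketched as follows. This is a reduced illustration under stated assumptions: struct flow_stats is a hypothetical subset of the flow record, flow_update is a name invented for the example, and detecting the first packet by a zero packet count is an assumption rather than the documented method.

```c
#include <stdint.h>

/* Hypothetical subset of the flow record, enough to show the update. */
struct flow_stats {
    uint32_t start;     /* capture time of the first packet */
    uint32_t last;      /* capture time of the most recent packet */
    uint32_t bytes[2];  /* bytes sent by ip[0] and ip[1] */
    uint32_t pkts[2];   /* packets sent by ip[0] and ip[1] */
    uint32_t flgs[2];   /* OR of all TCP flag bytes seen from each side */
};

/* dir is 0 when the packet source is ip[0], 1 when it is ip[1]. */
void flow_update(struct flow_stats *f, int dir, uint32_t now,
                 uint32_t length, uint8_t tcp_flags)
{
    if (f->pkts[0] + f->pkts[1] == 0)
        f->start = now;           /* assumed: first packet sets "start" */
    f->last = now;                /* "last" is overwritten by every packet */
    f->bytes[dir] += length;
    f->pkts[dir] += 1;
    f->flgs[dir] |= tcp_flags;    /* accumulate the flag bitmap */
}
```

Since the flag bitmap only ever gains bits, it records which flags were seen at least once, while the separate per-combination counters record how often.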

Host Data Structure 

The host data structure 166 accumulates data on all hosts that are observed
participating in a flow. A description of this data structure in C language format follows:

#define HOST_SLOTS 65537            // number Host slots
struct host_db {
    // data added by the Packet Classifier Thread
    unsigned long ip;               // ip address
    unsigned long down;             // linked list index
    unsigned long up;               // linked list index
    unsigned long start;            // time host record started
    unsigned long last;             // time of last packet from this host
    unsigned long udp_bytes;        // UDP bytes sent and received
    unsigned long bytes_in;         // bytes received
    unsigned long bytes_in_pp;      // Bytes over last 5 min interval
    unsigned long bytes_in_mx;      // max all day
    unsigned long pkts_in;          // packets received
    unsigned long bytes_ot;         // for Web_alert period
    unsigned long bytes_ot_pp;      // Bytes sent over 5 min interval
    unsigned long bytes_ot_mx;      // max bytes in 5-min interval all day
    unsigned long pkts_ot;          // packets sent
    unsigned long resets;           // TCP Reset packets received
    unsigned long rejects;          // icmp 'port unavailable' packets received
    unsigned long bad_pkts;         // SYN-ACK, and any other non-standard combination
    // data added by the Host Collector Thread
    unsigned long server;           // 32 common server ports - seen today
    unsigned long client;           // 32 common client ports - seen today
    unsigned long s_profile;        // 32 common server ports - (in profile),
                                    // predefined for common TCP and UDP services
    unsigned long c_profile;        // 32 common client ports - (in profile)
    unsigned short s_list[ODD_MAX]; // list of uncommon (odd) servers with bits to
                                    // indicate the same info as the preceding four bit maps
    unsigned short c_list[ODD_MAX]; // list of uncommon (odd) clients with bits to
                                    // indicate the same info as the preceding four bit maps
    unsigned long s_flows;          // Server in this many flows
    unsigned long c_flows;          // Client in this many flows
    unsigned long pings;            // pings
    unsigned long traces;           // traceroutes run
    unsigned long concern;          // accumulated CI
    // bits set by both threads to record "Alert Messages" such as "Bad TCP Flags".
    unsigned long alerts;           // bit map of alert conditions
} host[HOST_SLOTS];

Flow Collector Thread 

The flow collector thread 620 runs periodically (e.g., every five minutes) and
searches linearly through the entire flow data structure 162 to find flows that have been
inactive for a certain time period (e.g., 6 minutes). These flows are considered finished
and a logic-tree analysis is done to classify them as either a normal flow, or a potential
probe or other suspicious activity.
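
A minimal sketch of the inactivity scan follows. The names struct flow_slot, collect_idle, and IDLE_TIMEOUT are hypothetical, timestamps are assumed to be in seconds, and the callback stands in for the classification and logging steps described in the text.

```c
#include <stddef.h>
#include <stdint.h>

#define IDLE_TIMEOUT 360u   /* seconds; the 6-minute example from the text */

/* Hypothetical reduced flow slot for this sketch. */
struct flow_slot {
    int      in_use;
    uint32_t last;          /* time of the most recent packet, in seconds */
};

/* Linear scan over the whole table. Every flow idle for at least
 * IDLE_TIMEOUT is passed to on_finished (if given) and its slot is freed.
 * Assumes now >= last for every active slot. Returns the number expired. */
size_t collect_idle(struct flow_slot *tab, size_t n, uint32_t now,
                    void (*on_finished)(struct flow_slot *))
{
    size_t expired = 0;
    for (size_t i = 0; i < n; i++) {
        if (!tab[i].in_use)
            continue;
        if (now - tab[i].last >= IDLE_TIMEOUT) {
            if (on_finished)
                on_finished(&tab[i]);
            tab[i].in_use = 0;
            expired++;
        }
    }
    return expired;
}
```

With a five-minute period and a six-minute timeout, as in the examples given, a flow is collected on the first or second pass after its last packet.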

Normal flows are those for which the corresponding statistics indicate a normal
exchange of information between two hosts. The host that initiated the flow is considered
the client (i.e. the computer that sent TCP SYN packets or sent an initial UDP packet).
The other host is considered the server (i.e. the computer that sent TCP SYN-ACK packets
or responded to a UDP packet). Some data is exchanged during a normal flow.
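
The client and server roles described above could be derived from the per-side flag counters along the following lines. This is a sketch under assumptions: the indices 3 (syn) and 4 (syn-ack) follow the comment on the flag[2][7] field of the flow record, while struct flow_flags, its first_sender field, and client_side are hypothetical and not part of the disclosed structure.

```c
#include <stdint.h>

#define F_SYN     3   /* index into flag[][]: SYN packets */
#define F_SYNACK  4   /* index into flag[][]: SYN-ACK packets */

/* Hypothetical reduced record for this sketch. */
struct flow_flags {
    uint8_t flag[2][7];    /* per-side counts, laid out as in the flow record */
    int     first_sender;  /* assumed field: side (0 or 1) that sent first */
    int     is_udp;
};

/* Returns the index (0 or 1) of the client side, or -1 if it cannot be
 * determined from the counters. The other side is then the server. */
int client_side(const struct flow_flags *f)
{
    if (f->is_udp)
        return f->first_sender;   /* UDP: whoever sent the initial packet */

    for (int s = 0; s < 2; s++) {
        /* TCP: the client sent SYN and the other side answered SYN-ACK */
        if (f->flag[s][F_SYN] > 0 && f->flag[1 - s][F_SYNACK] > 0)
            return s;
    }
    return -1;
}
```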

A potential probe is a flow that appears to have one host (a possible intruder)
sending packets to gain information about another host (an intended victim). An example
of a potential probe is a flow that has TCP packets of any sort sent by one host (the
intruder) and approximately the same number of TCP reset packets sent by the other.
Another example is a flow which has UDP packets answered by ICMP "port unavailable"
packets. A flow with ICMP "destination unreachable" packets sent by one host would be
considered a potential probe being done by the other host.
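
The probe examples above can be turned into a small classification sketch. The text does not give the actual logic tree, so everything here is an assumption made for illustration: the struct flow_summary fields, the function names, and in particular the tolerance of one packet used for "approximately the same number".

```c
#include <stdint.h>

enum flow_class { FLOW_NORMAL, FLOW_POTENTIAL_PROBE };

/* Hypothetical per-flow summary for this sketch. */
struct flow_summary {
    uint32_t pkts[2];          /* packets sent by each side */
    uint32_t resets[2];        /* TCP reset packets sent by each side */
    uint32_t icmp_rejects[2];  /* ICMP rejections sent by each side */
    int      is_udp;
};

/* Assumed tolerance: counts within one packet of each other. */
int approx_equal(uint32_t a, uint32_t b)
{
    uint32_t diff = a > b ? a - b : b - a;
    return diff <= 1u;
}

enum flow_class classify_flow(const struct flow_summary *f)
{
    for (int s = 0; s < 2; s++) {
        int o = 1 - s;   /* the other side */

        /* packets from side s answered by ICMP rejections from side o */
        if (f->pkts[s] > 0 && f->icmp_rejects[o] > 0)
            return FLOW_POTENTIAL_PROBE;

        /* TCP packets from side s answered by about as many resets */
        if (!f->is_udp && f->pkts[s] > 0 && f->resets[o] > 0 &&
            approx_equal(f->pkts[s], f->resets[o]))
            return FLOW_POTENTIAL_PROBE;
    }
    return FLOW_NORMAL;
}
```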

After the flow is analyzed, the host data structure is updated. The port display, as
illustrated in FIG. 7, reflects the updated information. A black diamond indicator
indicates that the service has been previously seen today and is an authorized
service. A gray diamond, or a diamond with an internal cross for non-color displays,
indicates the service is seen for the first time today and is an authorized service. A red
diamond, shown as a clear diamond in a non-color display, is an alert indicating an Out of
Profile network service has been observed.
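
Given the 32-bit "seen today" and "in profile" bitmaps of the host data structure, the Out of Profile test reduces to a bitwise operation. The sketch below is illustrative only: the function and enumerator names are hypothetical, and it deliberately does not distinguish the black and gray indicators, since that depends on history the two bitmaps alone do not carry.

```c
#include <stdint.h>

enum indicator { IND_NONE, IND_SEEN_IN_PROFILE, IND_OUT_OF_PROFILE };

/* seen:    bitmap of services seen today (e.g. the "server" field)
 * profile: bitmap of allowed services   (e.g. the "s_profile" field)
 * bit:     index of the service column, 0..31 */
enum indicator service_indicator(uint32_t seen, uint32_t profile, unsigned bit)
{
    uint32_t mask = (uint32_t)1 << (bit & 31u);

    if (!(seen & mask))
        return IND_NONE;                   /* nothing to display */
    return (profile & mask) ? IND_SEEN_IN_PROFILE : IND_OUT_OF_PROFILE;
}

/* All Out of Profile services of a host at once: seen but not allowed. */
uint32_t out_of_profile(uint32_t seen, uint32_t profile)
{
    return seen & ~profile;
}
```

A nonzero result from out_of_profile is the condition that would raise the alert shown as a red diamond.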

Additionally, after the flow has been analyzed, the flow record is then written to
the flow log file and then cleared from the flow data structure.

Alert Manager Thread 

The alert manager thread 630 runs periodically (e.g., following the flow collector
thread 620) and does a linear search through the host data structure 166. As it does so, it
compiles a number of lists that are written to various output files for use by user interface
programs, i.e. programs that report information from the operation of the port profiling
engine 155 of appliance 150.

For example, the alert manager thread 630 preferably generates a profile display
646 of hosts that have used Out of Profile network services, as illustrated in reference to
FIG. 7. This is a screen that displays for the network administrator a list of all "inside
hosts" (hosts in the network of concern) and shows (using color coding) ports that are in
the port profile (gray if not seen today, black if seen today) and Out of Profile (red, seen
for the first time today). If a range, or set of ranges, of IP addresses has been defined by
the network administrator as "inside addresses," separate lists can be generated for
"inside" and "outside" hosts. Numerous other queries and reports 648 can be generated for
review and analysis by a network system administrator.

The packet classifier thread 610 collects information on network operations such as
packets and bytes on a per-second, per-minute, and per-hour basis. This information is
collected on all packets and on certain categories of packets such as TCP and UDP and
subsets of these based on port number. Histograms of packet size and TCP or UDP port
numbers are also collected. The alert manager thread 630 writes the updated data to
various output files for use by the user interface, or for later off-line analysis.

The alert manager 630 looks for hosts whose network usage indicates Out of
Profile network services. The new alarm conditions can cause immediate operator
notification by an operator notification process 642. These conditions can be highlighted
on the user interface, and cause SNMP trap messages to be sent to a network monitor such
as HP Openview, and/or email messages to the network administrator which in turn may
cause messages to be sent to beepers or cell phones. Messages can also be sent to cause
automated devices such as a firewall manager 644 to drop packets going to or from an
offending host. It will thus be appreciated that the present invention advantageously
operates in conjunction with firewalls and other network security devices and processes to
provide additional protection for an entity's computer network and computer resources.

Profile Display 

FIG. 7 illustrates an exemplary screen display 700 of the service profile for the
network administrator. The display 700 includes a list of all "inside hosts" (hosts in the
network of concern) which shows (using color coding) ports that are in the port profile
(gray if not seen today, black if seen today) and Out of Profile (red, seen for the first time
today). If a non-color display is utilized, the gray indicator can be illustrated as a crossed
diamond and a red indicator as a clear diamond. A range, or set of ranges, of IP addresses
can be defined by a network administrator as "inside addresses."

Display 710 shows each inside host IP address 721-726 and associated common
730 and non-standard 735 server network services in which the host acted as a server.
Additionally, the display 750 shows each inside host IP address 761-766 and associated
common 760 and non-standard 755 client network services in which the host acted as a
client. Each service seen or used in the last 24 hours is indicated by a diamond. A black
diamond 782 indicates the service is in profile and has been previously observed today.
For example, indicator 741 is a black diamond indicating the IP address 066.065.047.148
has acted as a server previously today using the HTTPS service and the HTTPS server
service is in profile for this host. A gray (or crossed) diamond 784 indicates the service is
in profile and has not been observed before today. For example, indicator 741
indicates the IP address 066.056.047.148 has sent an email using the SMTP for the first
time today and the host is authorized to use SMTP as a client. A red (or clear) diamond
786 indicates that the observed service is Out of Profile for that host. For example,
indicator 777 is a red diamond indicating the host has used an Out of Profile service. In
this case, host 066.056.079.001 has acted as a client using the HTTPS service, which is
not in this host's service usage profile.

The display 700 shows the common or standard network server services 730 and
network client services 760 selected by the network administrator for presentment. In
addition, the non-standard server services 735 and non-standard client services 755 are
displayed. The following table provides a description of the services displayed and the
associated port numbers for selected but exemplary standard network services:

Label on     Service Name                  UDP and TCP Port
Column                                     Numbers Included
---------    --------------------------    ------------------------------------------
DNS          Domain Name Service - UDP     UDP 53
DNS TCP      Domain Name Service - TCP     TCP 53
HTTP         Web                           TCP 80
HTTPS        Secure Web (SSL)              TCP 443
SMTP         Simple Mail Transport         TCP 25
POP          Post Office Protocol          TCP 109, 110
SNMP         Simple Network Management     TCP 161, 162 UDP 161, 162
TELNET       Telnet Terminal               TCP 23
FTP          File Transfer Protocol        TCP 20, 21
SSH          Secure Shell (terminal)       TCP 22
AFP IP       Apple File Protocol/IP        TCP 447, 548
NETBIOS      NetBIOS (Windows)             TCP 137-139, UDP 137-139
FNGR         Finger                        TCP 79
NEWS         Usenet, Network News          TCP 119
M*cast       Multicast                     IP addresses 224.0.0.0-239.255.255.255
B*cast       Broadcast                     IP address 255.255.255.255 or [net, subnet, -1]


The non-standard services are those network services that are not the most commonly seen
services utilizing either the TCP or UDP transport layer protocol. In addition, if other
transport layer protocols are seen other than TCP, UDP, or ICMP, these protocols are
treated as non-standard network services. As described in reference to FIG. 3, there are
251 other protocols that can be designated by the "next protocol" field in the IP header
besides the three common Internet Transport Layer protocols (ICMP, TCP, and UDP).
There are generally no port numbers with these other protocols, so these other protocols
are indicated simply by the "protocol number." The Internet Control Message Protocol
(ICMP), referred to as protocol number 1, is not tracked because any host using IP would
be using ICMP. The hosts using these protocols are all shown as "servers" since many of
the protocols are peer-to-peer and the transport protocol number indicates the service.
Both the common (standard) and non-standard network services are discussed in greater
detail in reference to FIG. 2.
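
The split between common and non-standard services can be sketched as a table lookup. The sketch covers only a subset of the services listed above; struct service, common_services, and service_label are hypothetical names, and returning NULL to mean "non-standard" is a convention chosen for this example. The protocol numbers 1, 6, and 17 are the standard IP protocol numbers for ICMP, TCP, and UDP.

```c
#include <stddef.h>
#include <stdint.h>

#define PROTO_ICMP  1   /* not tracked as a service, per the text */
#define PROTO_TCP   6
#define PROTO_UDP  17

struct service {
    const char *label;   /* column label on the display */
    uint8_t     proto;   /* IP protocol number */
    uint16_t    lo, hi;  /* inclusive port range */
};

/* Subset of the exemplary standard services. */
static const struct service common_services[] = {
    { "DNS",     PROTO_UDP,  53,  53 },
    { "DNS TCP", PROTO_TCP,  53,  53 },
    { "HTTP",    PROTO_TCP,  80,  80 },
    { "HTTPS",   PROTO_TCP, 443, 443 },
    { "SMTP",    PROTO_TCP,  25,  25 },
    { "POP",     PROTO_TCP, 109, 110 },
    { "TELNET",  PROTO_TCP,  23,  23 },
    { "FTP",     PROTO_TCP,  20,  21 },
    { "SSH",     PROTO_TCP,  22,  22 },
};

/* Returns the display label of a common service, or NULL when the
 * protocol/port pair would be treated as a non-standard service. */
const char *service_label(uint8_t proto, uint16_t port)
{
    size_t n = sizeof common_services / sizeof common_services[0];

    for (size_t i = 0; i < n; i++) {
        const struct service *s = &common_services[i];
        if (s->proto == proto && port >= s->lo && port <= s->hi)
            return s->label;
    }
    return NULL;
}
```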

Hardware

A preferred hardware configuration 800 of an embodiment that executes the
functions of the above described port profiling engine is described in reference to FIG. 8.
FIG. 8 illustrates a typical hardware configuration 800 for a port profiling detection
system. A monitoring appliance 150 serves as a pass-by filter of network traffic. A
network device 135, such as a router, switch, hub, tap, or the like, provides the location for
connecting the monitoring appliance 150 to the network 899 for monitoring the network
traffic.

As illustrated, the monitoring appliance 150 is preferably configured with two
network interface cards (NIC) 830 such as 3COM brand model 932 10/100 MHz adapters
or other adapters to match the network. However, it should be apparent to one skilled in
the art that one or more cards can be utilized to accomplish the functions of the presently
described dual card system. The monitor NIC 834 is typically set to a promiscuous mode
or a similar function. The promiscuous mode is a mode of operation in which every data
packet passing through the network device 135 will be received and read. An admin NIC
838 allows network interfacing and handles commands sent from the monitoring appliance
135. A NIC driver 820 enables the network traffic data to be exchanged with the
processor 850. Other drivers 825 are utilized to interface or communicate with other
devices including peripherals. These peripherals include keyboards, monitors, printers,
storage devices, and other input/output devices. As one skilled in the art will appreciate,
such drivers are typically packaged with the system.

The operating system 810 for the computer 800 preferably needs to be compatible
with the hardware of the monitoring appliance 150. One operating system 810 that can be
utilized is the operating system referred to as LINUX. One skilled in the art will
appreciate that other operating systems may be readily substituted. As is known to those
skilled in the art, the operating system of a computer controls the operation of the
processor 850. The processor 850 interfaces with the memory 805 to execute programs.
Preferably, the monitoring appliance will have 128 megabytes or more of memory.

As discussed in reference to FIG. 5, the processor 850 executes the packet
classifier thread 610, the flow collector thread 620, and the alert manager thread 630.
These threads interact with flow data structure 162 and the host data structure 166, as
described. The data structures provide temporary storage of information. As discussed in
reference to FIG. 6, a log file is maintained on the hard drive 840 for forensic analysis, if
desired.

In the exemplary embodiment, an auto configure module 870 is executed by the
processor 850. The auto configure module 870 is operable to enable the port profile host
data 166 for each host to be automatically generated as discussed in reference to FIG. 2.
Additionally, this embodiment can include a profile editor 880. The processor 850
executes the profile editor, enabling the modification of the port profile host data 166.

Flowcharts 

Refer now to FIG. 9 for a discussion of the steps of the preferred packet classifier,
flow collector, and alert manager threads. As previously discussed in reference to FIG. 6,
the preferred port profiling engine 155 comprises three operational threads or processes
that execute within a system or appliance that implements an embodiment of the
invention. The packet classifier thread 610 classifies packets into their associated flow
and updates the flow records. The flow collector thread 620 determines a termination of a
flow, performs a logic tree analysis to classify the flow, determines the port usage, and
updates the port display. Finally, the alert manager thread 630 generates reports and alarm
signals if an alarm threshold is exceeded.

In FIG. 9A, the packet classifier thread 610 begins with step 912. In step 912, the
thread 610 determines if a new packet is available. If a new packet is not available, the no
branch of step 912 is followed back to step 912, in which the thread 610 awaits a new
packet. If a new packet is available, the yes branch of step 912 is followed to step 914, in
which the thread determines if the packet belongs to a new flow.

As discussed previously, the header data of each packet processed is read by the
packet classifier thread 610. Based on the source and destination IP addresses, the thread
610 searches for an existing flow in the flow data structure 162, which is embodied as a
data array in memory. A symmetric hash of the two IP addresses is used as the index into
the array that points to the beginning of a two-way linked list of all flows with that hash
value.

Flow processing is done for TCP and UDP packets, and the port numbers in the
transport layer header are used to identify the flow record to be updated. For ICMP
packets that constitute rejections of a packet, the copy of the rejected packet in the ICMP
data field is used to identify the IP addresses and port numbers of the corresponding flow.

A packet 101 is considered to match a flow data structure record if both IP
numbers match and:

a) both port numbers match and no port is marked as the "server" port, or

b) the port number previously marked as the "server" port matches, or

c) one of the port numbers matches, but the other does not, and neither port
number has been marked as the server port (in this case the matching port number is
marked as the "server" port).

If a new flow is determined, the yes branch of step 914 is followed by step 916. In
step 916, a new flow record is created. If no flow exists that matches the current packet, a
new flow record is started using the IP addresses and port numbers from the current
packet, and is linked to the end of the appropriate linked list of flow records.

The IP address with the lower value, when considered as a 32-bit unsigned integer,
is designated ip[0] and the corresponding port number is designated pt[0]. The higher IP
address is designated ip[1] and the corresponding TCP or UDP port number is designated
pt[1]. At some point, either pt[0] or pt[1] may be designated the "server" port by setting
the appropriate bit in a bit map that is part of the flow record (record "state", bits 1 or 2
set).

Step 916 is followed by step 918, in which the flow records in the flow data
structure 162 are updated. The time that the flow started, the packet capture time, is
written into the record "start." The flow data structures updated by the packet classifier
thread are discussed in detail in reference to FIG. 6. Step 918 returns to step 912, in
which the thread 610 determines if a new packet is available.

Referring next to FIG. 9B, the flow collector thread 620 begins with step 942. In
step 942, the thread 620 determines if a periodic time has elapsed, e.g. 5 minutes in the
disclosed embodiment. If the requisite time period has not elapsed, the no branch of step
942 is followed back to step 942, in which the thread 620 awaits the time to elapse.

If the time has elapsed, the yes branch of step 942 is followed to step 943, in which
the thread 620 performs an inactivity search. The flow collector thread 620 runs
periodically (e.g., every five minutes) and searches linearly through the entire flow data
structure 162 to find flows that have been inactive for a certain time period (e.g., 6
minutes, although this time is arbitrary and may be heuristically determined). These flows
are considered finished.

Step 943 is followed by step 944. In step 944, a logic-tree analysis is done to
classify each finished flow as either a normal flow or as a potential probe. Normal flows
are those whose statistics indicate a normal exchange of information between two hosts.
Preferably, the host that initiated the flow is considered the client (sent TCP SYN packets
or sent the initial UDP packet). The other host is considered the server (sent TCP SYN-ACK
packets or responded to a UDP packet). Some data is exchanged during a normal flow.

As will be recalled, one exemplary indication of a potential probe is a flow that
appears to have one host (the intruder) sending packets to gain information about another
host (the victim). An example of a potential probe is a flow that has TCP packets of any
sort sent by one host (the intruder) and approximately the same number of TCP reset
packets sent by the other. Another example is a flow which has UDP packets answered by
ICMP "port unavailable" packets. A flow with ICMP "destination unreachable" packets
sent by one host would be considered a potential probe being done by the other host.

Step 944 is followed by step 945, in which a service port number is assigned to
each flow. Each host is assigned a service port number, a protocol such as TCP, UDP or
other, and type of operation which is either network service usage as a client or server.

Step 945 is followed by step 946. In step 946, the flow record is written to the
flow log file. In addition, the host data structure is updated to reflect the observed
services. Additionally, the port display is updated to indicate whether the port has been
previously seen today and whether the service is in profile for that host.
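
The host update in step 946 can be illustrated as follows. This is a sketch only; the
`HostRecord` class, its two sets, and the function `record_service` are hypothetical names,
and the actual layout of the host data structure is not specified in this passage.

```python
from dataclasses import dataclass, field


@dataclass
class HostRecord:
    """Hypothetical host entry holding a port profile and today's observations."""
    allowed: set = field(default_factory=set)      # services in the host's port profile
    seen_today: set = field(default_factory=set)   # services already observed today


def record_service(host, port, protocol, role):
    """Record one observed service and return the two flags the port display needs.

    Returns (previously_seen_today, in_profile).
    """
    service = (port, protocol, role)   # role is "client" or "server"
    previously_seen = service in host.seen_today
    host.seen_today.add(service)
    in_profile = service in host.allowed
    return previously_seen, in_profile
```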

Step 946 is followed by step 947. In step 947, the flow record is cleared from the
flow data structure. After step 947, the thread returns to step 942, in which the thread
waits for the requisite time.

Referring next to FIG. 9C, the alarm manager thread 630 begins with step 972. In
step 972, the thread 630 determines if a periodic time has elapsed. If the requisite time
period has not elapsed, the no branch of step 972 is followed to step 972, in which the
thread 630 awaits the time to elapse.

If the time has elapsed, the yes branch of step 972 is followed to step 973, in which
the thread 630 performs a port profile search. The alert manager thread 630 runs
periodically (e.g., following the flow manager thread 620) and does a linear search
through the host data structure 166.

Step 973 is followed by step 974. In step 974, the port profiling engine 155
compiles a number of lists that are written to various output files for use by the user
interface programs. If a range, or set of ranges, of IP addresses has been defined by the
network administrator as "inside addresses," separate lists can be generated for "inside"
and "outside" hosts. Numerous other queries and reports 648 can be generated for review
and analysis by the network administrator. The alert manager thread 630 writes the
updated data to various output files for use by the user interface, or for later off-line
analysis.
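
Separating "inside" from "outside" hosts can be sketched with Python's standard
`ipaddress` module. The function name `split_hosts` and the use of CIDR strings to express
the administrator's ranges are assumptions made for this example.

```python
import ipaddress


def split_hosts(host_ips, inside_ranges):
    """Partition host IP strings into (inside, outside) lists.

    `inside_ranges` is a list of CIDR strings standing in for the ranges the
    network administrator has defined as inside addresses (an assumed format).
    """
    networks = [ipaddress.ip_network(r) for r in inside_ranges]
    inside, outside = [], []
    for ip in host_ips:
        addr = ipaddress.ip_address(ip)
        if any(addr in net for net in networks):
            inside.append(ip)
        else:
            outside.append(ip)
    return inside, outside
```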

Step 974 is followed by step 975, in which the thread 630 determines if an
observed service is Out of Profile by comparing observed network usage with the allowed
network services. If an observed service is in profile, the no branch of step 975 is followed
back to step 972. In step 972, the thread 630 determines if a requisite time period has
elapsed.
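
The comparison in step 975 amounts to a set difference between observed and allowed
services for each host. The sketch below assumes a hypothetical mapping from host IP
address to a pair of sets; the function name and the data layout are illustrative, not the
layout of the host data structure 166.

```python
def find_out_of_profile(hosts):
    """Linearly search all hosts and list every Out of Profile service.

    `hosts` maps an IP string to (observed, allowed), each a set of
    (port, protocol, role) tuples. This layout is an assumption.
    """
    alerts = []
    for ip, (observed, allowed) in hosts.items():
        # Anything observed but not allowed is Out of Profile for this host.
        for service in sorted(observed - allowed):
            alerts.append((ip, service))
    return alerts
```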

If a service is found to be Out of Profile, the yes branch of step 975 is followed to
step 976. In step 976, the alert manager thread generates certain predetermined signals
designed to draw the attention of a system administrator or other interested person. These
conditions can be highlighted on the user interface and can cause SNMP trap messages to be
sent to a network monitor such as HP OpenView, and/or email messages to be sent to the
network administrator, which in turn may cause messages to be sent to beepers or cell phones.
Messages can also be sent to cause automated devices such as a firewall manager to drop
packets going to or from an offending host. Step 976 is followed by step 972, in which
the thread 630 awaits the requisite amount of time.
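
The fan-out of an alarm to several destinations can be sketched with plain callables. The
notifiers here are placeholders: no real SNMP, email, or firewall interface is invoked,
and the function name and message format are assumptions made for the example.

```python
def dispatch_alert(ip, service, notifiers):
    """Send one Out of Profile alert through every configured notifier.

    Each notifier is any callable taking the message text; in a deployment
    these would stand in for the user interface, an SNMP trap sender, an
    email sender, or a firewall manager (all hypothetical here).
    """
    port, protocol, role = service
    message = f"Out of Profile: host {ip} used {protocol}/{port} as {role}"
    for notify in notifiers:
        notify(message)
    return message
```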

In view of the foregoing, it will be appreciated that the present invention provides a
port profiling system that is robust, scalable, efficient, and overcomes various problems
with conventional signature-based or pure anomaly-based detection systems. It should be
understood that the foregoing relates only to the exemplary embodiments of the present
invention, and that numerous changes may be made therein without departing from the
spirit and scope of the invention as defined by the following claims. Accordingly, it is the
claims set forth below, and not merely the foregoing illustration, which are intended to
define the exclusive rights of the invention.



Industrial Applications
The port profiling system efficiently and reliably monitors network traffic for
unauthorized network usage with the ability to be scaled to large traffic flows.
Consequently, the port profiling engine has applicability in the fields of network
monitoring, network security, network devices, network communications, and similar
fields.
CLAIMS

What is claimed: 

1. A method for determining unauthorized network usage, comprising the steps of:
capturing packet header information from communications on a network;
determining valid connections or data flows;
determining hosts on the network that act as a client and server for each valid
connection or data flow; and
determining network services being used by every host in a predefined group of
hosts.

2. The method of claim 1, further comprising the step of displaying indicia indicating
observed network services during a monitoring period.

3. The method of claim 2, further comprising the step of displaying an indication of the
observed network services which were previously seen during the presentment period.

4. The method of claim 1, further comprising the steps of:
storing an allowed network services profile;
comparing allowed network services with observed network services; and
generating an alarm when an observed network service is not an allowed network
service.

5. The method of claim 3, further comprising the step of displaying indicia indicating
whether the observed network services is not an allowed network service.

6. The method of claim 4, further comprising the step of building a network service 
profile based upon network services observed during a profile generation time period. 

7. The method of claim 4, further comprising editing the allowed network services profile.



8. The method of claim 4, further comprising the step of editing the allowed network
services profile for a block of network addresses.

9. A method for determining unauthorized network usage, comprising the steps of:
capturing packet header information from communications on a network;
determining hosts on the network that act as a client and server for each valid
connection or data flow;
determining network services being used by every host in a predefined group of
hosts; and
generating an alarm upon an observed network service not being included in an
allowed network service profile.

10. A method for determining unauthorized network usage, comprising the steps of:
capturing packet header information from communications on a network;
determining valid connections or data flows;
storing an allowed network service port profile for each in a predefined host group;
determining observed network service port numbers being used by every host in
the predefined host group for each valid connection or data flow;
comparing the allowed network service port profile with observed network service
port numbers; and
generating an alarm when an observed network service port number is not
included in the allowed network service port profile.

11. The method of claim 10, further comprising the step of displaying indicia indicating
the observed network service port numbers during a present monitoring period.



12. The method of claim 11, further comprising the step of displaying an indication of the
observed network service port numbers which were previously seen during the
presentment period.

13. The method of claim 12, further comprising the step of displaying indicia indicating
whether the observed network service port numbers is included in the allowed network
service port profile.

14. The method of claim 10, further comprising the step of building the network service
port profile based upon the observed network service ports observed during a profile
generation time period.

15. The method of claim 10, further comprising editing the allowed network service port
profile.

16. The method of claim 15, further comprising the step of editing the allowed network 
service port profile for a block of network addresses. 

17. A system for determining unauthorized network usage, comprising:
a monitoring device operable to observe communication packets on a network;
a computer system operable to capture packet header information from observed
communication packets;
the computer system operable to determine valid connections or data flows;
the computer system operable to determine hosts on the network that act as a client
and server for each valid connection or data flow; and
the computer system operable to determine network services being used; and
the computer system operable to generate an alarm when an observed network
service is not an allowed network service.


18. The system of claim 17, further comprising a monitor coupled to the computer system
operable to display indicia indicating observed network services during a monitoring
period.



19. The system of claim 18, further comprising the monitor operable to display indicia
indicating whether the observed network services is not an allowed network service.

20. The system of claim 17, further comprising the computer system operable to build a
network service profile based upon network services observed during a profile generation
time period.

21. The system of claim 17, further comprising an editor coupled to the computer system
operable to edit the allowed network services profile.

22. The system of claim 21, further comprising the editor operable to edit the allowed
network services profile for a block of network addresses.


